THE PRINCIPLE OF THE LARGE SIEVE 



E. KOWALSKI 
For the sixtieth birthday of J-M. Deshouillers

Abstract. We describe a very general abstract form of sieve based on a large sieve inequality
which generalizes both the classical sieve inequality of Montgomery (and its higher-dimensional
variants), and our recent sieve for Frobenius over function fields. The general framework
suggests new applications. We give some first results on the number of prime divisors of "most"
elements of an elliptic divisibility sequence, and we develop in some detail "probabilistic" sieves
for random walks on arithmetic groups, e.g., estimating the probability of finding a reducible
characteristic polynomial at some step of a random walk on $SL(n, \mathbf{Z})$. In addition to the sieve
principle, the applications depend on bounds for a large sieve constant. To prove such bounds
involves a variety of deep results, including Property $(\tau)$ or expanding properties of Cayley
graphs, and the Riemann Hypothesis over finite fields.

1. Introduction

Classical sieve theory is concerned with the problem of the asymptotic evaluation of averages 
of arithmetic functions over integers constrained by congruence restrictions modulo a set of 
primes. Often the function in question is the characteristic function of some interesting sequence 
and the congruence restrictions are chosen so that those integers remaining after the sieving 
process are, for instance, primes or "almost" primes. 

If the congruence conditions are phrased as stating that the only integers $n$ which are allowed
are those with reduction modulo a prime $p$ not in a certain set $\Omega_p$, then a familiar dichotomy
arises: if $\Omega_p$ contains few residue classes (typically, a bounded number as $p$ increases), the setting
is that of a "small" sieve. The simplest such case is the detection of primes with $\Omega_p = \{0\}$. If,
on the other hand, the size of $\Omega_p$ increases, the situation is that of a "large" sieve. The first
such sieve was devised by Linnik to investigate the question of Vinogradov of the size of the
smallest quadratic non-residue modulo a prime.

There have already been a number of works extending "small" sieves to more general
situations, where the objects being sifted are not necessarily integers. Among these, one might
quote the vector sieve of Brüdern and Fouvry [BF], with applications to Lagrange's theorem
with almost prime variables, the "crible étrange" of Fouvry and Michel [FM], with applications
to sign changes of Kloosterman sums, and Poonen's striking sieve procedure for finding smooth
hypersurfaces of large degree over finite fields [Po].

Similarly, the large sieve has been extended in some ways, in particular (quite early on) to
deal with sieves in $\mathbf{Z}^d$, $d \ge 1$, or in number fields (see e.g. [G]). Interesting applications have
been found, e.g. Duke's theorem on elliptic curves over $\mathbf{Q}$ with "maximal" $p$-torsion fields for
all $p$ [D]. All these were much of the same flavor however, and in particular depended only
on the character theory of finite abelian groups as far as the underlying harmonic analysis was
concerned^.

In [Ko1], we have introduced a new large sieve type inequality for the average distribution
of Frobenius conjugacy classes in the monodromy groups of a family $(\mathcal{F}_\ell)$ of $\mathbf{F}_\ell$-adic sheaves on
a variety over a finite field. Although the spirit of the large sieve is clearly recognizable, the
setting is very different, and the harmonic analysis involves both non-abelian finite groups, and
the deep results of Deligne on the Riemann Hypothesis over finite fields. Our first application
of this new sieve was related to the "generic" arithmetic behavior of the numerator of the zeta
function of a smooth projective curve in a family with large monodromy, improving significantly
a result of Chavdarov [Ch].

Motivated by this first paper, the present one is concerned with foundational issues related
to the large sieve. We are able to describe a very general abstract framework which we call "the
principle of the large sieve", with a pun on [Mo]. This leads to a sieve statement that may in
particular be specialized to either the classical forms of the large sieve, or to a strengthening
of [Ko1]. Roughly speaking, we deal with a set $X$ that can be mapped to finite sets $X_\ell$ (for
instance, integers can be reduced modulo primes) and we show how an estimate for the number
of those $x \in X$ which have "reductions" outside $\Omega_\ell \subset X_\ell$ for all or some $\ell$ may be reduced
to a bilinear form estimate of a certain kind. The form of the sieve statement we obtain is
similar to Montgomery's formulation of the large sieve (see e.g. [Mo], [B], [IK, 7.4]). It should
be mentioned that our "axioms" for the sieve may admit other variations. In fact, Zywina [Z]
has developed a somewhat similar framework, and some of the flexibility we allow was first
suggested by his presentation.

There remains the problem of estimating the bilinear form. The classical idea of duality and 
exponential sums is one tool in this direction, and we describe it also somewhat abstractly. We 
then find a convincing relation with the classical sieve axioms, related to equidistribution in the
finite sets $X_\ell$.

The bilinear form inequality also seemingly depends on the choice of an orthonormal basis 
of certain finite-dimensional Hilbert spaces. It turns out that in many applications, the sieve 
setting is related to the existence of a group $G$ such that $X_\ell$ is the set of conjugacy classes in a
finite quotient $G_\ell$ of $G$ and the reduction $X \to X_\ell$ factors through $G$. In that case, the bilinear
form inequality can be stated with a distinguished basis arising from the representation theory
(or harmonic analysis) of the finite groups $G_\ell$.

This abstract sieving framework has many incarnations. As we already stated, we can recover 
the classical large sieve and the "sieve for Frobenius" of [Ko1], but furthermore, we are led to

^ There is, of course, an enormously important body of work concerning inequalities traditionally called "large 
sieve inequalities" for coefficients of automorphic forms of various types which have been developed by Iwaniec, 
Deshouillers-Iwaniec, Duke, Duke-Kowalski, Venkatesh and others (a short survey is in [IK, §7.7]). However, those
generalize the large sieve inequality for Dirichlet characters, and have usually no relation (except terminological) 
with the traditional sieve principle. 
a number of situations which are either new (to the author's knowledge), or have received
attention only recently, although not in the same form in general. One of these concerns (small)
sieves in arithmetic groups and is the subject of ongoing work of Bourgain, Gamburd and
Sarnak [BGS], and some of the problems it is suited for have been raised and partly solved by
Rivin [R], who also emphasized possible applications to some groups which are "close in spirit"
to arithmetic groups, such as mapping class groups of surfaces or automorphism groups of free
groups. Indeed, the large sieve strengthens significantly the results of Rivin (see Corollary 9.7).

Our main interest in writing this paper is the exploration of the general setting. Consequently, 
the paper is fairly open-ended and has a distinctly chatty style. We hope to come back to some 
of the new examples with more applications in the future. Still, to give a feeling for the type 
of results that become available, we finish this introduction with a few sample statements (the 
last one could in fact have been derived in [Ko1], with a slightly worse bound).

Theorem 1.1. Let $(S_n)$ be a simple random walk on $\mathbf{Z}$, i.e.,

$$S_n = X_1 + \cdots + X_n$$

where $(X_k)$ is a sequence of independent random variables with $P(X_k = \pm 1) = \frac{1}{2}$ for all $k$.
Let $\varepsilon > 0$ be given, $\varepsilon \le 1/4$. For any odd $q \ge 1$, any $a$ coprime with $q$, we have

$$P(S_n \text{ is prime and } \equiv a \pmod{q}) \ll \frac{1}{\varphi(q) \log n}$$

if $n \ge 1$, $q \le n^{1/4 - \varepsilon}$, the implied constant depending only on $\varepsilon$.

Theorem 1.2. Let $n \ge 2$ be an integer, let $G = SL(n, \mathbf{Z})$ and let $S = S^{-1} \subset G$ be a finite
generating set of $G$, e.g., the finite set of elementary matrices with $\pm 1$ entries off the diagonal.
Let $(X_k)$ be the simple left-invariant random walk on $G$, i.e., a sequence of $G$-valued random
variables such that $X_0 = 1$ and

$$X_{k+1} = X_k \xi_{k+1} \quad \text{for } k \ge 0,$$

where $(\xi_k)$ is a sequence of $S$-valued independent random variables with $P(\xi_k = s) = \frac{1}{|S|}$ for
all $s \in S$. Then, almost surely, there are only finitely many $k$ for which the characteristic
polynomial $\det(X_k - T) \in \mathbf{Z}[T]$ is reducible, or in other words, the set of matrices with reducible
characteristic polynomials in $SL(n, \mathbf{Z})$ is transient for the random walk.

In fact (see Theorem 9.4), we will derive this by showing that the probability that $\det(X_k - T)$
be reducible decays exponentially fast with $k$ (in the case $n \ge 3$ at least). An analogue of this
result (with some extra conditions) has the geometric/topological consequence that the set of
non-pseudo-Anosov elements is transient for random walks on mapping class groups of closed
orientable surfaces, answering a question of Maher [Ma, Question 1.3] (see Corollary 9.7 for
details; this application was suggested by Rivin's paper [R]).

Theorem 1.3. Let $n \ge 3$ be an integer, let $G = SL(n, \mathbf{Z})$, and let $S = S^{-1} \subset G$ be a finite
symmetric generating set. Then there exists $\beta > 0$ such that for any $N \ge 1$, we have

$$|\{w \in S^N \mid \text{one entry of the matrix } g_w \text{ is a square}\}| \ll |S|^{N(1-\beta)},$$

where $g_w = s_1 \cdots s_N$ for $w = (s_1, \ldots, s_N) \in S^N$, and $\beta$ and the implied constant depend only
on $n$ and $S$.

Equivalently, for the random walk $(X_k)$ on $G$ defined as in the statement of the previous
theorem, we have

$$P(\text{one entry of the matrix } X_k \text{ is a square}) \ll \exp(-\delta k)$$

for $k \ge 1$ and some constant $\delta > 0$, where $\delta$ and the implied constant depend only on $n$ and $S$.

Theorem 1.4. Let $E/\mathbf{Q}$ be an elliptic curve with rank $r \ge 1$ given by a Weierstrass equation

$$y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6, \quad \text{where } a_i \in \mathbf{Z}.$$

For $x \in E(\mathbf{Q})$, let $\omega_E(x)$ be the number of primes, without multiplicity, dividing the denominator
of the coordinates of $x$, with $\omega_E(0) = +\infty$. Let $h(x)$ denote the canonical height on $E$.
Then for any fixed real number $\kappa$ with $0 < \kappa < 1$, we have

$$|\{x \in E(\mathbf{Q}) \mid h(x) \le T \text{ and } \omega_E(x) < \kappa \log\log T\}| \ll T^{r/2} (\log\log T)^{-1},$$

for $T \ge 3$, where the implied constant depends only on $E$ and $\kappa$.

Theorem 1.5. Let $q$ be a power of a prime number $p \ge 5$, $g \ge 1$ an integer and let $f \in \mathbf{F}_q[T]$ be
a squarefree polynomial of degree $2g$. For $t$ not a zero of $f$, let $C_t$ denote the smooth projective
model of the hyperelliptic curve

$$y^2 = f(x)(x - t),$$

and let $J_t$ denote its Jacobian variety. Then we have

$$|\{t \in \mathbf{F}_q \mid f(t) \ne 0 \text{ and } |C_t(\mathbf{F}_q)| \text{ is a square}\}| \ll q^{1-\gamma} (\log q),$$

$$|\{t \in \mathbf{F}_q \mid f(t) \ne 0 \text{ and } |J_t(\mathbf{F}_q)| \text{ is a square}\}| \ll q^{1-\gamma} (\log q),$$

where $\gamma = (4g^2 + 2g + 4)^{-1}$, and the implied constants are absolute.

It is well-known that the strong form of the large sieve is as efficient (qualitatively) as the 
best small sieves, as far as upper bound sieves are concerned. To put this in context, we will 
briefly recall the principles of small sieves (in the same abstract context) in an Appendix, and 
we will give a sample application (Theorem A.3) related to Theorem 1.5.

The plan of this paper is as follows. In the first sections, the abstract sieve setting is described,
and the abstract large sieve inequality is derived; this is a pleasant and rather straightforward
algebraic exercise. In Sections 4 and 6, we specialize the general setting to two cases ("group
sieve" and "coset sieve") related to group theory, using the representation theory of finite groups.
This leads to the natural problem of finding precise estimates for the degree and the sum of
degrees of irreducible representations of some finite groups of Lie type, which we consider in
some cases in Section 7. For this we use Deligne-Lusztig characters, and arguments shown to
the author by J. Michel; this section may be omitted in a first reading.

Turning to examples of sieves, already in Section 5 we show how many classically-known uses
of the large sieve are special cases of the setting of Section 4. In the same section, we also
indicate the relation with the inclusion-exclusion technique in probability and combinatorics,
which shows in particular that the general sieve bound is sharp (see Example 5.6).

New (or emerging) situations are considered next, in four sections which are quite independent
of one another (all of them involve either group or coset sieves). "Probabilistic" sieves are
discussed briefly in Section 8, leading to Theorem 1.1. Sieving in arithmetic groups is described
in Section 9, where Theorem 1.2 is proved. The crucial point (as in the work of Bourgain,
Gamburd and Sarnak) is the expanding properties of Cayley graphs of $SL(n, \mathbf{Z}/d\mathbf{Z})$, phrased
in terms of Property $(\tau)$. Then comes an amusing "elliptic sieve" which is related to the
number of prime divisors of the denominators of rational points on an elliptic curve, leading to
Theorem 1.4. In turn, this is linked to the analysis of the prime factorization of elements of
so-called "elliptic divisibility sequences", and we find that "most" elements have many prime
factors. This complements recent heuristics and results of Silverman, Everest, Ward and others
concerning the paucity of primes and prime powers in such sequences. Finally, in Section 11, we
extend the sieve result of [Ko1] concerning the distribution of geometric Frobenius conjugacy
classes in finite monodromy groups over finite fields, and derive some new applications. To
conclude, Appendix A briefly indicates the link with small sieve situations, for the purpose of
comparison and reference, with a sample application, and Appendix B contains the proofs of
some "local" density computations in matrix groups over finite fields. Those estimates are
used earlier in the paper, but we defer the proofs so as not to distract from the main thrust of the arguments
underlying the principle of the sieve. Note that the techniques underlying those computations
are in fact quite advanced and of independent interest, and involve work of Chavdarov [Ch] and
non-trivial estimates for exponential sums over finite fields.

Notation. As usual, $|X|$ denotes the cardinality of a set; however if $X$ is a measure space
with measure $\mu$, we sometimes write $|X|$ instead of $\mu(X)$.

For a group $G$, $G^\sharp$ denotes the set of its conjugacy classes, and for a conjugacy-invariant
subset $X \subset G$, $X^\sharp \subset G^\sharp$ is the corresponding set of conjugacy classes. The conjugacy class of
$g \in G$ is denoted $g^\sharp$.

By $f \ll g$ for $x \in X$, or $f = O(g)$ for $x \in X$, where $X$ is an arbitrary set on which $f$ is
defined, we mean synonymously that there exists a constant $C \ge 0$ such that $|f(x)| \le C g(x)$
for all $x \in X$. The "implied constant" is any admissible value of $C$. It may depend on the
set $X$ which is always specified or clear in context. The notation $f \asymp g$ means $f \ll g$ and
$g \ll f$. On the other hand $f(x) = o(g(x))$ as $x \to x_0$ is a topological statement meaning that
$f(x)/g(x) \to 0$ as $x \to x_0$.

For $n \ge 1$ an integer, $\omega(n)$ is the number of primes dividing $n$, without counting multiplicity.
For $z \in \mathbf{C}$, we denote $e(z) = \exp(2i\pi z)$.

In probabilistic contexts, $P(A)$ is the probability of an event $A$, $E(X)$ is the expectation of a
random variable $X$, $V(X)$ its variance, and $\mathbf{1}_A$ is the characteristic function of an event $A$.

Acknowledgments. D. Zywina has developed [Z] an abstract setup of the large sieve similar
to the conjugacy sieve described in Section 4. His remarks have been very helpful both for the
purpose of straightening out the assumptions used, and as motivation for the search of new 
"unusual" applications. One of his nice tricks (the use of general sieve support) is also used 
here. The probabilistic setting was suggested in part by Rivin's preprint [R]; Rivin also mentioned
to me the work of Bourgain, Sarnak and Gamburd. I also wish to thank P. Sarnak for sending 
me a copy of his email Sa I to his coauthors. Finally, I thank J. Michel for providing the 
ideas of the proof of Proposition 7.3 and explaining some basic properties of representations of
finite groups of Lie type, and P. Duchon and M-L. Chabanol for help, advice and references 
concerning probability theory and graph theory. 

2. The principle of the large sieve 

We will start by describing a very general type of sieve. The goal is to reach an analogue 
of the large sieve inequality, in the sense of a reduction of a sieve bound to a bilinear form 
estimate. 

We start by introducing the notation and terminology. The sieve setting is a triple
$\Psi = (Y, \Lambda, (\rho_\ell))$ consisting of

• A set $Y$;

• An index set $\Lambda$;

• For all $\ell \in \Lambda$, a surjective map $\rho_\ell : Y \to Y_\ell$ where $Y_\ell$ is a finite set.

In combinatorial terms, this might be thought of as a family of colorings of the set $Y$. In
applications, $\Lambda$ will often be a subset of primes (or prime ideals in some number field), but as
first pointed out by Zywina, this is not necessary for the formal part of setting up the sieve,
and although the generality is not really abstractly greater, it is convenient to allow arbitrary
$\Lambda$.

Then, a siftable set associated to $\Psi = (Y, \Lambda, (\rho_\ell))$ is a triple $\Upsilon = (X, \mu, F)$ consisting of

• A measure space $(X, \mu)$ with $\mu(X) < +\infty$;

• A map $F : X \to Y$ such that the composites $X \to Y \to Y_\ell$ are measurable, i.e., the
sets $\{x \in X \mid \rho_\ell(F_x) = y\}$ are measurable for all $\ell$ and all $y \in Y_\ell$.

The simplest case is when $X$ is a finite set and $\mu$ is counting measure. We call this the
counting case. Even when this is not the case, for notational convenience, we will usually write
$|B| = \mu(B)$ for the measure of a measurable set $B \subset X$.

The last piece of data is a finite subset $\mathcal{L}^*$ of $\Lambda$, called the prime sieve support, and a family
$\Omega = (\Omega_\ell)$ of sieving sets^ $\Omega_\ell \subset Y_\ell$, defined for $\ell \in \mathcal{L}^*$.

With this final data $(\Psi, \Upsilon, \mathcal{L}^*, \Omega)$, we can define the sieve problem.

^ Sometimes, $\Omega$ will also denote a probability space, but no confusion should arise.

Definition 2.1. Let $\Psi = (Y, \Lambda, (\rho_\ell))$ be a sieve setting, $\Upsilon = (X, \mu, F)$ a siftable set, $\mathcal{L}^*$ a prime
sieve support and $\Omega$ a family of sieving sets. Then the sifted sets are

$$S(Y, \Omega; \mathcal{L}^*) = \{y \in Y \mid \rho_\ell(y) \notin \Omega_\ell \text{ for all } \ell \in \mathcal{L}^*\},$$
$$S(X, \Omega; \mathcal{L}^*) = \{x \in X \mid \rho_\ell(F_x) \notin \Omega_\ell \text{ for all } \ell \in \mathcal{L}^*\}.$$

The latter is also $F^{-1}(S(Y, \Omega; \mathcal{L}^*))$ and is a measurable subset of $X$.

The problem we will consider is to find estimates for the measure $|S(X, \Omega; \mathcal{L}^*)|$ of the sifted
set. Here we think that the sieve setting is fixed, while there usually will be an infinite sequence
of siftable sets with size $|X|$ going to infinity; this size will be the main variable in the estimates.

Example 2.2. The classical sieve arises as follows: the sieve setting is

$$\Psi = (\mathbf{Z}, \{\text{primes}\}, \mathbf{Z} \to \mathbf{Z}/\ell\mathbf{Z})$$

and the siftable sets are $X = \{n \mid M < n \le M + N\}$ with counting measure and $F_x = x$ for
$x \in X$. Then the sifted sets become the classical sets of integers in an interval with reductions
modulo primes in $\mathcal{L}^*$ lying outside a subset $\Omega_\ell \subset \mathbf{Z}/\ell\mathbf{Z}$ of residue classes.
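
To make Example 2.2 concrete, here is a minimal Python sketch that enumerates a classical sifted set by brute force. The function name `sifted_set` and the dictionary `omega` (mapping each prime of the prime sieve support to its set of excluded residue classes) are illustrative choices, not notation from the paper.

```python
def sifted_set(M, N, omega):
    """Integers n with M < n <= M + N whose reduction modulo l avoids omega[l]
    for every prime l in the (hypothetical) dictionary omega."""
    return [
        n
        for n in range(M + 1, M + N + 1)
        if all(n % l not in residues for l, residues in omega.items())
    ]


# Excluding the class 0 modulo 2, 3, 5 and 7 keeps the integers in (10, 60]
# that are coprime to 210.
omega = {2: {0}, 3: {0}, 5: {0}, 7: {0}}
survivors = sifted_set(10, 50, omega)
```

Since every composite integer up to 60 has a prime factor at most 7, the survivors in this particular run are exactly the primes in the interval.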

In most cases, $(X, \mu)$ will be a finite set with counting measure, and often $X \subset Y$ with $F_x = x$
for $x \in X$. See Section ^2 for a conspicuous example where $F$ is not the identity, Section |H1 for
interesting situations where the measure space $(X, \mu)$ is a probability space, and $F$ a random
variable, and Section for another example.

We will now indicate one type of inequality that reduces the sieve problem to the estimation
of a large sieve constant $\Delta$. The latter is a more analytic problem, and can be attacked in
a number of ways. This large sieve constant depends on most of the data involved, but is
independent of the sieving sets.

First we need some more notation. Given a sieve setting $\Psi$, we let $S(\Lambda)$ denote the set of
finite subsets $m \subset \Lambda$. Since $S(\Lambda)$ may be identified with the set of squarefree integers $m \ge 1$
in the classical case where $\Lambda$ is the set of primes, to simplify notation we write $\ell \mid m$ for $\ell \in m$
when $\ell \in \Lambda$ and $m \in S(\Lambda)$, and similarly $n \mid m$ instead of $n \subset m$ if $n, m \in S(\Lambda)$.

A sieve support $\mathcal{L}$ associated to a prime sieve support $\mathcal{L}^*$ is any finite subset of $S(\Lambda)$ such
that

(2.1) $\ell \in m$, $m \in \mathcal{L}$ implies $\ell \in \mathcal{L}^*$, and $\{\ell\} \in \mathcal{L}$ if $\ell \in \mathcal{L}^*$.

This implies that $\mathcal{L}$ determines $\mathcal{L}^*$ (as the set of elements of singletons in $\mathcal{L}$). If $\Lambda$ is a set
of primes, $\mathcal{L}$ "is" a set of squarefree integers only divisible by primes in $\mathcal{L}^*$ and containing $\mathcal{L}^*$
(including possibly $m = 1$, not divisible by any prime).

For $m \in S(\Lambda)$, let

$$Y_m = \prod_{\ell \mid m} Y_\ell$$

and let $\rho_m : Y \to Y_m$ be the obvious product map. (In other words, we look at all "refined"
colorings of $Y$ obtained by looking at all possible finite tuples of colorings). If $m = \emptyset$, $Y_m$ is a
set with a single element, and $\rho_m$ is a constant map.

We will consider functions on the various sets $Y_m$, and it will be important to endow the
space of complex-valued functions on $Y_m$ with appropriate and consistent inner products. For
this purpose, we assume given for $\ell \in \Lambda$ a density

$$\nu_\ell : Y_\ell \to [0, 1]$$

(often denoted simply $\nu$ when no ambiguity is possible) such that the inner product on functions
$f : Y_\ell \to \mathbf{C}$ is given by

$$\langle f, g \rangle = \sum_{y \in Y_\ell} \nu_\ell(y) f(y) \overline{g(y)}.$$

We assume that $\nu(y) > 0$ for all $y \in Y_\ell$, in order that this hermitian form be positive definite
(it will be clear that $\nu(y) \ge 0$ would suffice, but the stronger assumption is no problem for
applications), and that $\nu$ is a probability density, i.e., we have

(2.2) $\sum_{y \in Y_\ell} \nu_\ell(y) = 1.$

Using the product structure we define corresponding inner products and measures on the
spaces of functions $Y_m \to \mathbf{C}$. Property (2.2) still holds. We will interpret $\nu$ as a measure on $Y_\ell$
or $Y_m$, so we will write for instance

$$\nu_\ell(\Omega_\ell) = \sum_{y \in \Omega_\ell} \nu(y), \quad \text{for } \Omega_\ell \subset Y_\ell.$$

We denote by $L^2(Y_m)$ the space of complex-valued functions on $Y_m$ with the inner product
thus defined.

The simplest example is when $\nu_\ell(y) = 1/|Y_\ell|$, but see Sections 4 and 6 for important natural
cases where $\nu$ is not uniform. It will be clear in the remarks and sections following the statement
of the sieve inequality that, in general, the apparent choice of $\nu_\ell$ is illusory (only one choice will
lead to good results).

Note that $\rho_m$ is not necessarily surjective. However, it is surjective in most applications of
the sieve, and this is a crucial fact, so we make a definition (the terminology will be clearer in later
applications).

Definition 2.3. A sieve setting $\Psi = (Y, \Lambda, (\rho_\ell))$ is linearly disjoint if the map $\rho_m : Y \to Y_m$ is
onto for all $m \in S(\Lambda)$.

Here is now the first sieve inequality. 

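Proposition 2.4. Let $\Psi$, $\Upsilon$, $\mathcal{L}^*$ be as above. For any sieve support $\mathcal{L}$ associated to $\mathcal{L}^*$, i.e., any
finite subset of $S(\Lambda)$ satisfying (2.1), let $\Delta = \Delta(X, \mathcal{L})$ denote the large sieve constant, which is
by definition the smallest non-negative real number such that

(2.3) $\sum_{m \in \mathcal{L}} \sum_{\varphi \in B_m^*} \Bigl| \int_X \alpha(x) \varphi(\rho_m(F_x))\, d\mu(x) \Bigr|^2 \le \Delta \int_X |\alpha(x)|^2\, d\mu(x)$

for any square integrable function $\alpha : X \to \mathbf{C}$, where $\varphi$ ranges over $B_m^*$,
where $B_\ell^* = B_\ell - \{1\}$, $B_\ell$ is an orthonormal basis, containing the constant function 1, of the
space $L^2(Y_\ell)$, and for all $m$ we let

$$B_m = \prod_{\ell \mid m} B_\ell, \qquad B_m^* = \prod_{\ell \mid m} B_\ell^*,$$

the function on $Y_m$ corresponding to $(\varphi_\ell)$ being given by

$$(y_\ell) \mapsto \prod_{\ell \mid m} \varphi_\ell(y_\ell),$$

and for $m = \emptyset$, we have $B_m = B_m^* = \{1\}$.

Then for arbitrary sieving sets $\Omega = (\Omega_\ell)$, we have

$$|S(X, \Omega; \mathcal{L}^*)| \le \Delta H^{-1}$$

where

(2.4) $H = \sum_{m \in \mathcal{L}} \prod_{\ell \mid m} \frac{\nu_\ell(\Omega_\ell)}{\nu_\ell(Y_\ell - \Omega_\ell)} = \sum_{m \in \mathcal{L}} \prod_{\ell \mid m} \frac{\nu_\ell(\Omega_\ell)}{1 - \nu_\ell(\Omega_\ell)}.$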
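
As an illustration of how the bound of Proposition 2.4 is used, the following Python sketch computes $H$ exactly in the classical setting of Example 2.2 and compares the resulting upper bound with a brute-force count. It relies on one fact that is not established in this section: the classical large sieve inequality of Montgomery and Vaughan, which gives $\Delta \le N + Q^2$ when the sieve support consists of the squarefree integers up to $Q$ and $X$ is an interval of $N$ integers. The names `sieve_H`, `nu_omega` and `support` are illustrative assumptions.

```python
from fractions import Fraction
from math import prod


def sieve_H(support, nu_omega):
    """H of (2.4): sum over m in the support of prod_{l | m} nu(Omega_l) / (1 - nu(Omega_l)).

    support:  iterable of tuples of primes, one tuple per m (the empty tuple is m = 1)
    nu_omega: dict mapping a prime l to nu_l(Omega_l), a Fraction in [0, 1)
    """
    return sum(
        prod((nu_omega[l] / (1 - nu_omega[l]) for l in m), start=Fraction(1))
        for m in support
    )


Q, M, N = 7, 10, 50
omega = {2: {0}, 3: {0}, 5: {0}, 7: {0}}            # exclude the class 0 mod l
nu_omega = {l: Fraction(len(res), l) for l, res in omega.items()}
support = [(), (2,), (3,), (5,), (2, 3), (7,)]      # squarefree m <= Q, as sets of primes
H = sieve_H(support, nu_omega)

# Brute-force sifted set, and the bound Delta / H with Delta replaced by the
# assumed classical estimate N + Q^2.
sifted = [n for n in range(M + 1, M + N + 1) if all(n % l not in omega[l] for l in omega)]
upper_bound = Fraction(N + Q * Q) / H
```

With these sieving sets each factor is $1/(\ell - 1)$, so $H$ is the familiar sum of $\mu(m)^2/\varphi(m)$ over $m \le Q$.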

Remark 2.5. The large sieve constant as defined above is independent of the choices of basis $B_\ell$
(containing the constant function 1). Here is a more intrinsic definition which shows this, and
provides a first hint of the link with classical (small) sieve axioms. It is not clear how useful this
intrinsic definition can be in practice, which explains why we kept a concrete version in
the statement of Proposition 2.4.

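By definition, the inequality (2.3) means that $\Delta$ is the square of the norm of the linear
operator

$$T : L^2(X, \mu) \to \bigoplus_{m \in \mathcal{L}} L(L_0^2(Y_m), \mathbf{C}), \qquad \alpha \mapsto \Bigl( f \mapsto \int_X \alpha(x) f(\rho_m(F_x))\, d\mu(x) \Bigr)_m,$$

where the direct sum over $m$ is orthogonal and $L(L_0^2(Y_m), \mathbf{C})$ is the space of linear functionals
on

$$L_0^2(Y_m) = \bigotimes_{\ell \mid m} L_0^2(Y_\ell), \quad \text{where } L_0^2(Y_\ell) = \Bigl\{ f \in L^2(Y_\ell) \mid \langle f, 1 \rangle = \sum_y \nu(y) f(y) = 0 \Bigr\}$$

(the space $L_0^2(Y_m)$ may be thought of as the "primitive" subspace of the functions on $Y_m$), with
the norm

$$\|f^*\| = \max_{f \ne 0} \frac{|f^*(f)|}{\|f\|}.$$

Since we are dealing with Hilbert spaces, $L(L_0^2(Y_m), \mathbf{C})$ is canonically isometric to $L_0^2(Y_m)$,
and $\Delta$ is the square of the norm of the operator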



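$$T_1 : L^2(X, \mu) \to \bigoplus_{m \in \mathcal{L}} L_0^2(Y_m), \qquad \alpha \mapsto T_1(\alpha),$$

where $T_1(\alpha)$ is the vector such that $\langle f, T_1(\alpha) \rangle = T(\alpha)(f)$ for $f \in L_0^2(Y_m)$, $m \in \mathcal{L}$. This vector
is easy to identify: we have

$$\int_X \alpha(x) f(\rho_m(F_x))\, d\mu(x) = \sum_{y \in Y_m} f(y) \Bigl( \int_{\{\rho_m(F_x) = y\}} \alpha(x)\, d\mu(x) \Bigr),$$

which means that $T_1(\alpha)$ is the complex-conjugate of the projection to $L_0^2(Y_m)$ of the function

$$y \mapsto \frac{1}{\nu(y)} \int_{\{\rho_m(F_x) = y\}} \alpha(x)\, d\mu(x)$$

on $Y_m$. For $m = \{\ell\}$, this projection is obtained by subtracting the contribution of the constant
function, i.e., subtracting the average over $y$: it is

$$y \mapsto \frac{1}{\nu(y)} \int_{\{\rho_\ell(F_x) = y\}} \alpha(x)\, d\mu(x) - \int_X \alpha(x)\, d\mu(x).$$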

In the case of counting measure and a uniform density $\nu$, this becomes the quantity

$$\sum_{\rho_\ell(F_x) = y} \alpha(x) - \frac{1}{|Y_\ell|} \sum_{x \in X} \alpha(x)$$

after multiplying by $\nu(y)$, which is a typical "error term" appearing in sieve axioms.

To prove Proposition 2.4, we start with two lemmas. For $m \in S(\Lambda)$, $y \in Y_m$, an element $\varphi$
of the basis $B_m$, and a square-integrable function $\alpha \in L^2(X, \mu)$, we denote

(2.5) $S(m, y) = \int_{\{\rho_m(F_x) = y\}} \alpha(x)\, d\mu(x)$, and $S(\varphi) = \int_X \alpha(x) \varphi(\rho_m(F_x))\, d\mu(x)$,

where the integral is defined because $\mu(X) < +\infty$ by assumption. The first lemma is the
following:



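Lemma 2.6. We have for all $\ell \in \Lambda$ the relation

$$\sum_{\varphi \in B_\ell^*} |S(\varphi)|^2 = \sum_{y \in Y_\ell} \frac{|S(\ell, y)|^2}{\nu_\ell(y)} - \Bigl| \int_X \alpha(x)\, d\mu(x) \Bigr|^2.$$

Proof. Expanding the square by Fubini's Theorem, the left-hand side is

$$\int_X \int_X \alpha(x) \overline{\alpha(y)} \sum_{\varphi \in B_\ell^*} \varphi(\rho_\ell(F_x)) \overline{\varphi(\rho_\ell(F_y))}\, d\mu(x)\, d\mu(y).$$

Since $(\varphi)_{\varphi \in B_\ell}$ is an orthonormal basis of the space of functions on $Y_\ell$, expanding the delta
function $z \mapsto \delta(y, z)$ in the basis gives

$$\sum_{\varphi \in B_\ell} \overline{\varphi(y)} \varphi(z) = \frac{1}{\nu(y)} \delta(y, z).$$

Taking on the right-hand side the contribution of the constant function 1, we get in particular

$$\sum_{\varphi \in B_\ell^*} \varphi(\rho_\ell(F_x)) \overline{\varphi(\rho_\ell(F_y))} = \frac{1}{\nu(\rho_\ell(F_x))} \delta(\rho_\ell(F_x), \rho_\ell(F_y)) - 1.$$

Inserting this in the first relation, we obtain

$$\sum_{\varphi \in B_\ell^*} |S(\varphi)|^2 = \iint_{\{\rho_\ell(F_x) = \rho_\ell(F_y)\}} \frac{\alpha(x) \overline{\alpha(y)}}{\nu(\rho_\ell(F_x))}\, d\mu(x)\, d\mu(y) - \int_X \int_X \alpha(x) \overline{\alpha(y)}\, d\mu(x)\, d\mu(y)$$
$$= \sum_{y \in Y_\ell} \frac{|S(\ell, y)|^2}{\nu(y)} - \Bigl| \int_X \alpha(x)\, d\mu(x) \Bigr|^2,$$

as desired. □

Here is the next lemma.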



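Lemma 2.7. Let $(\Psi, \Upsilon, \Omega, \mathcal{L}^*)$ be as above, and let $\mathcal{L}$ be any sieve support associated to $\mathcal{L}^*$.
For any square-integrable function $x \mapsto \alpha(x)$ on $X$ supported on the sifted set $S(X, \Omega; \mathcal{L}^*) \subset X$,
and for any $m \in \mathcal{L}$, we have

$$\sum_{\varphi \in B_m^*} |S(\varphi)|^2 \ge \Bigl| \int_X \alpha(x)\, d\mu(x) \Bigr|^2 \prod_{\ell \mid m} \frac{\nu_\ell(\Omega_\ell)}{\nu_\ell(Y_\ell - \Omega_\ell)},$$

where $S(\varphi)$ is given by (2.5).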

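Proof. Since this does not change the sifted set, we may replace $\mathcal{L}$ if necessary by the full power
set of $\mathcal{L}^*$. Then, as in the classical case (see e.g. [IK, Lemma 7.15]), the proof proceeds by
induction on the number of elements in $m$. If $m = \emptyset$, the inequality is trivial (there is equality,
in fact). If $m = \{\ell\}$ with $\ell \in \Lambda$ (in the arithmetic case, $m$ is a prime), then $\ell \in \mathcal{L}^*$ by (2.1).
Using Cauchy's inequality and the definition of the sifted set with the assumption on $\alpha(x)$ to
restrict the support of integration to elements where $\rho_\ell(F_x) \notin \Omega_\ell$, we obtain:

$$\Bigl| \int_X \alpha(x)\, d\mu(x) \Bigr|^2 = \Bigl| \sum_{y \notin \Omega_\ell} S(\ell, y) \Bigr|^2 \le \Bigl( \sum_{y \notin \Omega_\ell} \nu(y) \Bigr) \Bigl( \sum_{y \notin \Omega_\ell} \frac{|S(\ell, y)|^2}{\nu(y)} \Bigr)$$
$$\le \nu(Y_\ell - \Omega_\ell) \sum_{y \in Y_\ell} \frac{|S(\ell, y)|^2}{\nu(y)} = \nu(Y_\ell - \Omega_\ell) \Bigl( \sum_{\varphi \in B_\ell^*} |S(\varphi)|^2 + \Bigl| \int_X \alpha(x)\, d\mu(x) \Bigr|^2 \Bigr)$$

(by Lemma 2.6), hence the result by moving the term $\nu(Y_\ell - \Omega_\ell) |\int_X \alpha(x)\, d\mu|^2$ to the left-hand side, since $\nu(Y_\ell) = 1$.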

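The induction step is now immediate, relying on the fact that the function $\alpha$ is arbitrary and
the sets $B_m^*$ are "multiplicative": for $m \in \mathcal{L}$, not a singleton, write $m = m_1 m_2 = m_1 \cup m_2$ with
$m_1$ and $m_2$ non-empty and disjoint. Then we have^

$$\sum_{\varphi \in B_m^*} |S(\varphi)|^2 = \sum_{\varphi_1 \in B_{m_1}^*} \sum_{\varphi_2 \in B_{m_2}^*} |S(\varphi_1 \otimes \varphi_2)|^2,$$

where $\varphi_1 \otimes \varphi_2$ is the function $(y, z) \mapsto \varphi_1(y) \varphi_2(z)$. For fixed $\varphi_1$, we can express the inner sum
as

$$S(\varphi_1 \otimes \varphi_2) = \int_X \beta(x) \varphi_2(\rho_{m_2}(F_x))\, d\mu(x)$$

with $\beta(x) = \alpha(x) \varphi_1(\rho_{m_1}(F_x))$, which is also supported on $S(X, \Omega; \mathcal{L}^*)$. By the induction
hypothesis applied first to $m_2$, then to $m_1$, we obtain

$$\sum_{\varphi \in B_m^*} |S(\varphi)|^2 \ge \prod_{\ell \mid m_2} \frac{\nu_\ell(\Omega_\ell)}{\nu_\ell(Y_\ell - \Omega_\ell)} \sum_{\varphi_1 \in B_{m_1}^*} \Bigl| \int_X \beta(x)\, d\mu(x) \Bigr|^2 \ge \prod_{\ell \mid m_1 m_2} \frac{\nu_\ell(\Omega_\ell)}{\nu_\ell(Y_\ell - \Omega_\ell)} \Bigl| \int_X \alpha(x)\, d\mu(x) \Bigr|^2.$$

□

Now the proof of Proposition 2.4 is easy.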



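Proof of Proposition 2.4. Take $\alpha(x)$ to be the characteristic function of $S(X, \Omega; \mathcal{L}^*)$ and sum
over $m \in \mathcal{L}$ the inequality of Lemma 2.7; since

$$\int_X \alpha(x)\, d\mu(x) = \int_X \alpha(x)^2\, d\mu(x) = |S(X, \Omega; \mathcal{L}^*)|,$$

it follows from (2.3) that

$$|S(X, \Omega; \mathcal{L}^*)|^2 H \le \sum_{m \in \mathcal{L}} \sum_{\varphi \in B_m^*} |S(\varphi)|^2 \le \Delta |S(X, \Omega; \mathcal{L}^*)|,$$

hence the result. □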
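
The identity of Lemma 2.6 and the inequality of Lemma 2.7 can be checked numerically in the classical setting ($Y = \mathbf{Z}$, uniform densities, additive characters as in the classical large sieve). The following Python sketch is only a sanity check under those assumptions; the helper names `S_phi`, `lemma_2_6_sides` and `lemma_2_7_sides` are hypothetical.

```python
import cmath
from math import gcd, prod


def e(t):
    """e(t) = exp(2 i pi t)."""
    return cmath.exp(2j * cmath.pi * t)


def S_phi(alpha, a, m):
    """S(phi) of (2.5) for phi(y) = e(a y / m), F_x = x and counting measure."""
    return sum(w * e(a * x / m) for x, w in alpha.items())


def lemma_2_6_sides(alpha, l):
    """Both sides of Lemma 2.6 for the prime l with nu_l(y) = 1 / l."""
    lhs = sum(abs(S_phi(alpha, a, l)) ** 2 for a in range(1, l))
    fibres = [sum(w for x, w in alpha.items() if x % l == y) for y in range(l)]
    rhs = sum(l * abs(s) ** 2 for s in fibres) - abs(sum(alpha.values())) ** 2
    return lhs, rhs


def lemma_2_7_sides(alpha, m_primes, omega):
    """Both sides of Lemma 2.7 for m = set of primes m_primes; alpha must vanish
    outside the sifted set defined by omega."""
    m = prod(m_primes)
    lhs = sum(abs(S_phi(alpha, a, m)) ** 2 for a in range(1, m + 1) if gcd(a, m) == 1)
    # nu(Omega_l) / nu(Y_l - Omega_l) = |Omega_l| / (l - |Omega_l|) for uniform nu
    ratio = prod(len(omega[l]) / (l - len(omega[l])) for l in m_primes)
    return lhs, abs(sum(alpha.values())) ** 2 * ratio


omega = {2: {0}, 3: {0}, 5: {0}}
alpha = {
    x: 1 + 0.25 * (x % 7)
    for x in range(1, 61)
    if all(x % l not in omega[l] for l in omega)
}
lhs26, rhs26 = lemma_2_6_sides(alpha, 5)
checks27 = [lemma_2_7_sides(alpha, m, omega) for m in [(), (3,), (2, 3), (2, 3, 5)]]
```

For the empty set of primes the two sides of Lemma 2.7 coincide, in agreement with the remark in the proof that the case $m = \emptyset$ is an equality.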



Example 2.8. In the classical case, with $Y = \mathbf{Z}$ and $Y_\ell = \mathbf{Z}/\ell\mathbf{Z}$, we can identify $Y_m$ with
$\mathbf{Z}/m\mathbf{Z}$ by the Chinese Remainder Theorem. With $\nu_\ell(y) = 1/\ell$ for all $\ell$ and all $y$, the usual basis
of functions on $Y_m$ is that of additive characters

$$x \mapsto e\Bigl(\frac{ax}{m}\Bigr)$$

for $a \in \mathbf{Z}/m\mathbf{Z}$. It is easy to check that such a character belongs to $B_m^*$ if and only if $a$ and $m$
are coprime.
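
The last claim of Example 2.8 can be verified mechanically for small moduli. The sketch below (with the hypothetical helper `primitive_frequencies`) builds the frequencies $a$ modulo $m$ of all products of non-trivial characters modulo the primes $\ell \mid m$, using $e(a_\ell x/\ell) = e(a_\ell (m/\ell) x/m)$, and compares them with the residues coprime to $m$.

```python
from math import gcd


def primitive_frequencies(primes):
    """Return (m, freqs): m is the product of the distinct primes given, and freqs is
    the set of a mod m such that x -> e(a x / m) is a product of non-trivial additive
    characters modulo each prime l dividing m."""
    m = 1
    for l in primes:
        m *= l
    freqs = {0}
    for l in primes:
        # a non-trivial character modulo l is x -> e(a_l x / l) with a_l != 0 mod l,
        # which has frequency a_l * (m / l) modulo m
        freqs = {(a + a_l * (m // l)) % m for a in freqs for a_l in range(1, l)}
    return m, freqs


m15, freqs15 = primitive_frequencies([3, 5])
m30, freqs30 = primitive_frequencies([2, 3, 5])
```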

At this point a "large sieve inequality" will be an estimate for the quantity $\Delta$. There are
various techniques available for this purpose; see [IK, Ch. VII] for a survey of some of them.

The simplest technique is to use the familiar duality principle for bilinear forms or linear
operators. Since $\Delta$ is the square of the norm of a linear operator, it is the square of the norm
of its adjoint. Hence we have:

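Lemma 2.9. Let $\Psi = (Y, \Lambda, (\rho_\ell))$ be a sieve setting, $(X, \mu, F)$ a siftable set, $\mathcal{L}$ a sieve support
associated to $\mathcal{L}^*$. Fix orthonormal bases $B_\ell$ and define $B_m^*$ as above. Then the large sieve
constant $\Delta(X, \mathcal{L})$ is the smallest number $\Delta$ such that

(2.6) $\int_X \Bigl| \sum_{m \in \mathcal{L}} \sum_{\varphi \in B_m^*} \beta(m, \varphi) \varphi(\rho_m(F_x)) \Bigr|^2 d\mu(x) \le \Delta \sum_{m \in \mathcal{L}} \sum_{\varphi \in B_m^*} |\beta(m, \varphi)|^2$

for all vectors of complex numbers $(\beta(m, \varphi))$.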

^ Here we use the enlargement of $\mathcal{L}$ at the beginning to ensure that $m_i \in \mathcal{L}$.

The point is that this leads to another bound for $\Delta$ in terms of bounds for the "dual" sums
$W(\varphi, \varphi')$ obtained by expanding the square in this inequality, i.e.

$$W(\varphi, \varphi') = \int_X \varphi(\rho_m(F_x)) \overline{\varphi'(\rho_n(F_x))}\, d\mu(x),$$

where $\varphi \in B_m$ and $\varphi' \in B_n$ for some $m$ and $n$ in $S(\Lambda)$. Precisely, we have:



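Proposition 2.10. Let $\Psi = (Y, \Lambda, (\rho_\ell))$ be a sieve setting, $\Upsilon = (X, \mu, F)$ a siftable set, $\mathcal{L}^*$ a
prime sieve support and $\mathcal{L}$ an associated sieve support. Then the large sieve constant satisfies

(2.7) $\Delta \le \max_{n \in \mathcal{L}} \max_{\varphi' \in B_n^*} \sum_{m \in \mathcal{L}} \sum_{\varphi \in B_m^*} |W(\varphi, \varphi')|.$

Proof. Expanding the left-hand side of (2.6), we have

$$\int_X \Bigl| \sum_{m \in \mathcal{L}} \sum_{\varphi \in B_m^*} \beta(m, \varphi) \varphi(\rho_m(F_x)) \Bigr|^2 d\mu(x) = \sum_{m, n} \sum_{\varphi, \varphi'} \beta(m, \varphi) \overline{\beta(n, \varphi')} W(\varphi, \varphi')$$

and applying $|uv| \le \frac{1}{2}(|u|^2 + |v|^2)$ the result follows as usual. □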

The point is that sieve results are now reduced to individual uniform estimates for the "sums"
$W(\varphi, \varphi')$. Note that, here, the choice of the orthonormal basis may well be very important in
estimating $W(\varphi, \varphi')$ and therefore $\Delta$.
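
To see the bound (2.7) at work, the following Python sketch evaluates the dual sums $W(\varphi, \varphi')$ in a toy classical case (an interval of integers, counting measure, additive characters) and checks that the resulting value can be used in place of $\Delta$ in the inequality (2.3). Squarefree integers stand for finite sets of primes, and all function names are illustrative assumptions.

```python
import cmath
from math import gcd


def e(t):
    """e(t) = exp(2 i pi t)."""
    return cmath.exp(2j * cmath.pi * t)


def primitive_characters(support):
    """Pairs (a, m), m in the support and gcd(a, m) = 1, standing for x -> e(a x / m)."""
    return [(a, m) for m in support for a in range(1, m + 1) if gcd(a, m) == 1]


def dual_sum(X, char1, char2):
    """W(phi, phi') for F_x = x and counting measure on X."""
    (a, m), (b, n) = char1, char2
    return sum(e(x * a / m) * e(x * b / n).conjugate() for x in X)


def dual_bound(X, support):
    """Right-hand side of (2.7)."""
    chars = primitive_characters(support)
    return max(sum(abs(dual_sum(X, c, c2)) for c in chars) for c2 in chars)


def bilinear_form(X, support, alpha):
    """Left-hand side of (2.3) for the weights alpha[x]."""
    return sum(
        abs(sum(alpha[x] * e(x * a / m) for x in X)) ** 2
        for a, m in primitive_characters(support)
    )


X = range(1, 31)
support = [1, 2, 3, 5, 6]                 # a sieve support with prime support {2, 3, 5}
alpha = {x: (x % 3) - 1 + 0.5j * (x % 2) for x in X}
bound = dual_bound(X, support)
lhs = bilinear_form(X, support, alpha)
rhs = bound * sum(abs(w) ** 2 for w in alpha.values())
```

The diagonal term $W(\varphi, \varphi) = |X|$ always contributes, so the bound is never smaller than $|X|$; the quality of (2.7) depends on how small the off-diagonal sums are.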

However, at least formally, we can proceed in full generality as follows, where the idea is that
in applications $\rho_m(F_x)$ should range fairly equitably (with respect to the density $\nu_m$) over the
elements of $Y_m$, so the sum $W(\varphi, \varphi')$ should be estimated by exploiting the "periodicity" of
$\varphi(\rho_m(F_x)) \overline{\varphi'(\rho_n(F_x))}$. To do this, we introduce further notation.

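Let $m$, $n$ be two elements of $S(\Lambda)$, $\varphi \in B_m$, $\varphi' \in B_n$. Let $d = m \cap n$ be the intersection (g.c.d.
in the case of integers) of $m$ and $n$, and write $m = m'd = m' \cup d$, $n = n'd = n' \cup d$ (disjoint
unions). According to the multiplicative definition of $B_m$ and $B_n$, we can write

$$\varphi = \varphi_{m'} \otimes \varphi_d, \qquad \varphi' = \varphi'_{n'} \otimes \varphi'_d$$

for some unique basis elements $\varphi_{m'} \in B_{m'}$, $\varphi_d, \varphi'_d \in B_d$ and $\varphi'_{n'} \in B_{n'}$.

Let $[m, n] = m \cup n$ be the "l.c.m" of $m$ and $n$. We have the decomposition

$$Y_{[m,n]} = Y_{m'} \times Y_d \times Y_{n'},$$

the (not necessarily surjective) map $\rho_{[m,n]} : Y \to Y_{[m,n]}$ and the function

(2.8) $[\varphi, \overline{\varphi'}] = \varphi_{m'} \otimes (\varphi_d \overline{\varphi'_d}) \otimes \overline{\varphi'_{n'}} : (y_1, y_d, y_2) \mapsto \varphi_{m'}(y_1) \varphi_d(y_d) \overline{\varphi'_d(y_d)} \overline{\varphi'_{n'}(y_2)}$

(which is not usually a basis element in $B_{[m,n]}$).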

The motivation for all this is the following tautology: 

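Lemma 2.11. Let $m$, $n$, $\varphi$ and $\varphi'$ be as before. We have

$$[\varphi, \overline{\varphi'}](\rho_{[m,n]}(y)) = \varphi(\rho_m(y)) \overline{\varphi'(\rho_n(y))}$$

for all $y \in Y$, hence

$$W(\varphi, \varphi') = \int_X [\varphi, \overline{\varphi'}](\rho_{[m,n]}(F_x))\, d\mu(x).$$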

Now we can hope to split the integral according to the value of $y = \rho_{[m,n]}(F_x)$ in $Y_{[m,n]}$, and
evaluate it by summing the main term in an equidistribution statement.

More precisely, for $d \in S(\Lambda)$ and $y \in Y_d$, we define $r_d(X; y)$ as the "error term" in the
expected equidistribution statement:

(2.9) $|\{\rho_d(F_x) = y\}| = \int_{\{\rho_d(F_x) = y\}} d\mu(x) = \nu_d(y) |X| + r_d(X; y).$
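
In the classical counting case the error terms of (2.9) are easy to tabulate. The Python sketch below (the name `error_terms` is a hypothetical helper) computes $r_d(X; y)$ exactly for a finite set of integers $X$, reduction modulo a squarefree integer $d$, and the uniform density $\nu_d(y) = 1/d$.

```python
from fractions import Fraction


def error_terms(X, d):
    """r_d(X; y) = #{x in X : x = y mod d} - |X| / d for each class y modulo d
    (counting measure on X, uniform density nu_d)."""
    X = list(X)
    return {
        y: sum(1 for x in X if x % d == y) - Fraction(len(X), d)
        for y in range(d)
    }


# For an interval, every residue class is hit either floor(N/d) or ceil(N/d) times,
# so each error term has absolute value less than 1.
r = error_terms(range(11, 61), 6)
```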
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Then we can write W{ip, ip') as described before: 



X 



(2.10) =m{[ip,ip'])\X\ + 0(^ Yl \\[^,^']\Urim,n]{X;y)\^ 

after inserting (|2.9() . where the impHed constant is of modulus ^ 1 and 

the inner product in L^(Y[^ ,j]). One would then hope that m([(^, (/?']) is the delta-symbol 
5{{m^ip), {n,ip')) which would select the diagonal in the main term of the sums W{ip,ip'). In 
Sections H] and IHl we will see how to evaluate this quantity for the special case of group and 
coset sieves. But first, a short digression... 

3. The "dual" sieve 

The equivalent definition of the large sieve constant by means of the duality principle (i.e.,
Lemma 2.9) is quite useful in itself. For instance, it yields the following type of sieve inequality,
which in the classical case goes back to Rényi.

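Proposition 3.1. Let $(Y, \Lambda, (\rho_\ell))$ be a sieve setting, $(X, \mu, F)$ a siftable set and $\mathcal{L}^*$ a prime
sieve support. Let $\Delta$ be the large sieve constant for $\mathcal{L} = \mathcal{L}^*$.¹ Then for any sifting sets $(\Omega_\ell)$, we
have

(3.1) $\int_X \bigl(P(x,\mathcal{L}) - P(\mathcal{L})\bigr)^2\, d\mu(x) \le \Delta\, Q(\mathcal{L})$

where

(3.2) $P(x,\mathcal{L}) = \sum_{\ell \in \mathcal{L}^*,\ \rho_\ell(F_x) \in \Omega_\ell} 1, \quad P(\mathcal{L}) = \sum_{\ell \in \mathcal{L}^*} \nu_\ell(\Omega_\ell), \quad Q(\mathcal{L}) = \sum_{\ell \in \mathcal{L}^*} \nu_\ell(\Omega_\ell)\bigl(1 - \nu_\ell(\Omega_\ell)\bigr).$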

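Proof. By expanding the characteristic function $\chi(\Omega_\ell)$ of $\Omega_\ell \subset Y_\ell$ in the orthonormal basis $B_\ell$,
we obtain

$P(x,\mathcal{L}) = P(\mathcal{L}) + \sum_{\ell \in \mathcal{L}^*} \sum_{\varphi \in B_\ell^*} \beta(\ell,\varphi)\,\varphi(\rho_\ell(F_x)),$

where

$\beta(\ell,\varphi) = \sum_{y \in \Omega_\ell} \nu_\ell(y)\,\overline{\varphi(y)},$

and we used the fact that $B_\ell^* = B_\ell - \{1\}$ for $\ell \in \Lambda$. Thus we get

$\int_X \bigl(P(x,\mathcal{L}) - P(\mathcal{L})\bigr)^2 d\mu(x) = \int_X \Bigl| \sum_{\ell \in \mathcal{L}^*} \sum_{\varphi \in B_\ell^*} \beta(\ell,\varphi)\,\varphi(\rho_\ell(F_x)) \Bigr|^2 d\mu(x) \le \Delta \sum_{\ell \in \mathcal{L}^*} \sum_{\varphi \in B_\ell^*} |\beta(\ell,\varphi)|^2$

by applying (2.6). Since we have

$\sum_{\varphi \in B_\ell^*} |\beta(\ell,\varphi)|^2 = \|\chi(\Omega_\ell)\|^2 - |\langle \chi(\Omega_\ell), 1 \rangle|^2 = \nu_\ell(\Omega_\ell)\bigl(1 - \nu_\ell(\Omega_\ell)\bigr),$

this implies the result. □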

¹ Precisely, $\mathcal{L}$ is the set of singletons $\{\ell\}$ for $\ell \in \mathcal{L}^*$.

In particular, since $P(x,\mathcal{L}) = 0$ for $x \in S(X;\Omega,\mathcal{L}^*)$ and $Q(\mathcal{L}) \le P(\mathcal{L})$, we get (by positivity
again) the estimate

$|S(X;\Omega,\mathcal{L}^*)| \le \Delta\, P(\mathcal{L})^{-1}$

which is the analogue of the inequalities used e.g. by Gallagher in [G, Th. A], and by the
author in [Ko1]. This inequality also follows from Proposition 2.4 if we take $\mathcal{L}$ containing only
singletons (in the arithmetic case, this means using only the primes), since we get the estimate

$|S(X,\Omega;\mathcal{L}^*)| \le \Delta H^{-1}$ with $H = \sum_{\ell \in \mathcal{L}^*} \frac{\nu_\ell(\Omega_\ell)}{1 - \nu_\ell(\Omega_\ell)} \ge \sum_{\ell \in \mathcal{L}^*} \nu_\ell(\Omega_\ell) = P(\mathcal{L})$

(in fact, by Cauchy's inequality, we have $P(\mathcal{L})^2 \le H\,Q(\mathcal{L})$).

This type of result is also related to Turán's method in probabilistic number theory. In
counting primes with the classical setting, or more generally in "small sieve" situations, it may
seem quite weak (it only implies $\pi(X) \ll X(\log\log X)^{-1}$). However, it is really a different type
of statement, which has additional flexibility: for instance, it still implies that for $X \ge 3$ we
have

$|\{n \le X \mid \omega(n) < \kappa \log\log X\}| \ll \frac{X}{\log\log X}$

for any $\kappa \in\, ]0,1[$, the implied constant depending only on $\kappa$. This estimate is now of the right
order of magnitude, and this shows in particular that one cannot hope to improve (3.1) by
using information related to all "squarefree" numbers; in other words, Proposition 2.4 cannot
be extended "as is" to an upper bound for the variance on the left of (3.1).

These remarks indicate that Proposition 3.1 has its own interest in cases where the "stronger"
form of the large sieve is in fact not adapted to the type of situation considered. In Section 10
we will describe an amusing use of the inequality (3.1), where the "pure sieve" bound would
indeed be essentially trivial.
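As a sanity check on (3.1), here is a minimal numerical sketch in the classical setting. It assumes $X = \{1, \ldots, N\}$ with counting measure, $\Omega_\ell = \{0\}$ for each prime $\ell \le L$ (so that $P(x,\mathcal{L})$ counts the primes $\ell \le L$ dividing $x$), and it uses the classical bound $\Delta \le |X| - 1 + L^2$ as an upper bound for the large sieve constant. The function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def primes_up_to(L):
    """Primes <= L by trial division (enough for small L)."""
    return [p for p in range(2, L + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def dual_sieve_sides(N, L):
    """Return (lhs, rhs) of (3.1) for X = {1..N}, Omega_l = {0}, primes l <= L.

    lhs = sum over x of (P(x) - P)^2, with P(x) = #{l <= L prime : l | x};
    rhs = (N - 1 + L^2) * Q, using the classical bound for Delta (an assumption
    of this sketch, not a computation of the exact large sieve constant).
    """
    primes = primes_up_to(L)
    P = sum(Fraction(1, p) for p in primes)              # sum of nu_l(Omega_l)
    Q = sum(Fraction(1, p) * (1 - Fraction(1, p)) for p in primes)
    lhs = sum((sum(1 for p in primes if x % p == 0) - P) ** 2
              for x in range(1, N + 1))
    rhs = (N - 1 + L * L) * Q
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = dual_sieve_sides(1000, 30)
    print(float(lhs), float(rhs), lhs <= rhs)
```

Exact rational arithmetic is used so that the comparison is not affected by rounding.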

4. Group and conjugacy sieves 

We now come to the description of a more specific type of sieve setting, related to a group
structure on $Y$. Together with the coset sieves of Section 6, this exhausts most examples of
applications we know at the moment.

A group sieve corresponds to a sieve setting $\Psi = (G, \Lambda, (\rho_\ell))$ where $G$ is a group and the
maps $\rho_\ell : G \to G_\ell$ are homomorphisms onto finite groups. A conjugacy sieve, similarly, is a
sieve setting $\Psi = (G, \Lambda, (\rho_\ell))$ where $\rho_\ell : G \to G_\ell^\sharp$ is a surjective map from $G$ to the finite set of
conjugacy classes $G_\ell^\sharp$ of a finite group $G_\ell$, that factors as

$G \to G_\ell \to G_\ell^\sharp$

where $G \to G_\ell$ is a surjective homomorphism. Obviously, if $G$ is abelian, group and conjugacy
sieves are identical, and any group sieve induces a conjugacy sieve.

The group structure suggests a natural choice of orthonormal basis $B_\ell$ for functions on $G_\ell$ or
$G_\ell^\sharp$, as well as natural densities $\nu_\ell$. We start with the simpler conjugacy sieve.

From the classical representation theory of finite groups (see, e.g., [S2]), we know that for
any $\ell \in \Lambda$, the functions

$y \mapsto \mathrm{Tr}\,\pi(y),$

on $G_\ell$, where $\pi$ runs over the set $\Pi_\ell$ of (isomorphism classes of) irreducible linear representations
$\pi : G_\ell \to GL(V_\pi)$, form an orthonormal basis of the space $C(G_\ell)$ of functions on $G_\ell$ invariant
under conjugation, with the inner product

$\langle f, g \rangle = \frac{1}{|G_\ell|} \sum_{y \in G_\ell} f(y)\,\overline{g(y)}.$

Translating this statement to functions on the set $G_\ell^\sharp$ of conjugacy classes, this means that
the functions

$\varphi_\pi(y^\sharp) = \mathrm{Tr}\,\pi(y^\sharp)$

on $G_\ell^\sharp$ form an orthonormal basis $B_\ell$ of $L^2(G_\ell^\sharp)$ with the inner product

$\langle f, g \rangle = \frac{1}{|G_\ell|} \sum_{y^\sharp \in G_\ell^\sharp} |y^\sharp|\, f(y^\sharp)\,\overline{g(y^\sharp)}.$

Moreover, the trivial representation 1 of $G_\ell$ has for character the constant function 1, so we
can use the basis $B_\ell = (\mathrm{Tr}\,\pi(y^\sharp))_\pi$ for computing the large sieve constant if the density

$\nu_\ell(y^\sharp) = \frac{|y^\sharp|}{|G_\ell|}$

is used. Note that this is the image on $G_\ell^\sharp$ of the uniform density on $G_\ell$.
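To make the orthonormality statement concrete, the following small sketch checks it for the symmetric group $S_3$, taken here purely as an illustrative choice of $G_\ell$. The class sizes and character table are the standard ones; the names `CLASS_SIZES`, `CHARACTERS`, `density` and `inner` are introduced only for this example.

```python
from fractions import Fraction

# Conjugacy classes of S_3, in the order: identity, transpositions, 3-cycles.
CLASS_SIZES = [1, 3, 2]
GROUP_ORDER = sum(CLASS_SIZES)  # |S_3| = 6

# Character table of S_3 (values on the three classes above).
CHARACTERS = {
    "trivial": [1, 1, 1],
    "sign": [1, -1, 1],
    "standard": [2, 0, -1],
}


def density(i):
    """nu(y#) = |y#| / |G|: image of the uniform density on the group."""
    return Fraction(CLASS_SIZES[i], GROUP_ORDER)


def inner(f, g):
    """<f, g> = sum over classes of nu(y#) f(y#) g(y#).

    The characters of S_3 are real-valued, so no complex conjugation is needed.
    """
    return sum(density(i) * f[i] * g[i] for i in range(len(CLASS_SIZES)))


if __name__ == "__main__":
    for a, chi in CHARACTERS.items():
        for b, psi in CHARACTERS.items():
            print(a, b, inner(chi, psi))
```

The three characters come out orthonormal for the density $|y^\sharp|/|G_\ell|$, and the trivial character is the constant function 1, as required of the basis $B_\ell$.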

Note also that in the abelian case, the representations are one-dimensional, and the basis
thus described is the basis of characters of $G_\ell$, with the uniform density, i.e., that of group
homomorphisms $G_\ell \to \mathbf{C}^\times$.

Coming back to a general group sieve, the basis and densities extended to the sets

$G_m^\sharp = \prod_{\ell \mid m} G_\ell^\sharp$

for $m \in S(\Lambda)$ have a similar interpretation. Indeed, $G_m^\sharp$ identifies clearly with the set of
conjugacy classes of the finite group $G_m = \prod G_\ell$. The density $\nu_m$ is therefore still given by

$\nu_m(y^\sharp) = \frac{|y^\sharp|}{|G_m|}.$

Also, it is well-known that the irreducible representations of $G_m$ are of the form

$\pi = \boxtimes_{\ell \mid m} \pi_\ell$

for some uniquely defined irreducible representations $\pi_\ell$ of $G_\ell$, where $\boxtimes$ is the external tensor
product defined by

$g = (g_\ell) \mapsto \bigotimes_{\ell \mid m} \pi_\ell(g_\ell).$

In other words, the set $\Pi_m$ of irreducible linear representations of $G_m$ is identified canonically
with $\prod \Pi_\ell$. Moreover, the character of a representation of $G_m$ of this form is simply

$\mathrm{Tr}\,\pi(g) = \prod_{\ell \mid m} \mathrm{Tr}\,\pi_\ell(g_\ell),$

so that the basis $B_m$ obtained from $B_\ell$ is none other than the basis of functions $y^\sharp \mapsto \mathrm{Tr}\,\pi(y^\sharp)$
for $\pi$ ranging over $\Pi_m$.

Given a siftable set $(X, \mu, F)$ associated to a conjugacy sieve $(G, \Lambda, (\rho_\ell))$, the sums $W(\varphi, \varphi')$
become

(4.1) $W(\pi,\tau) = \int_X \mathrm{Tr}\,\pi(\rho_m(F_x))\,\overline{\mathrm{Tr}\,\tau(\rho_n(F_x))}\, d\mu(x)$

for irreducible representations $\pi$ and $\tau$ of $G_m$ and $G_n$ respectively, which can usually be
interpreted as exponential sums (or integrals) over $X$, since the character values, as traces of
matrices of finite order, are sums of finitely many roots of unity.

We summarize briefly by reproducing the general sieve results in this context, phrasing things
as related to the conjugacy sieve induced from a group sieve (which seems most natural for
applications). In such a situation, the sieving sets are naturally given as conjugacy-invariant
subsets $\Omega_\ell$ of $G_\ell$, and are identified with subsets $\Omega_\ell^\sharp$ of $G_\ell^\sharp$. Note that we have then
$\nu_\ell(\Omega_\ell^\sharp) = |\Omega_\ell|/|G_\ell|$.

Proposition 4.1. Let $(G, \Lambda, (\rho_\ell))$ be a group sieve setting, $(X, \mu, F)$ an associated siftable set.
For any prime sieve support $\mathcal{L}^*$ and an associated sieve support $\mathcal{L}$ satisfying (2.1), and for any
conjugacy invariant sifting sets $(\Omega_\ell)$, we have

$|S(X,\Omega;\mathcal{L}^*)| \le \Delta H^{-1}$

where $\Delta$ is the smallest non-negative real number such that

$\sum_{m \in \mathcal{L}} \sum_{\pi \in \Pi_m^*} \Bigl| \int_X \alpha(x)\, \mathrm{Tr}\,\pi(\rho_m(F_x))\, d\mu(x) \Bigr|^2 \le \Delta \int_X |\alpha(x)|^2\, d\mu(x)$

for all square-integrable functions $\alpha \in L^2(X,\mu)$, where $\pi$ ranges over the set $\Pi_m^*$ of primitive
irreducible linear representations of $G_m$, i.e., those such that no component $\pi_\ell$ for $\ell \mid m$ is
trivial, and where

$H = \sum_{m \in \mathcal{L}} \prod_{\ell \mid m} \frac{|\Omega_\ell|}{|G_\ell| - |\Omega_\ell|}.$

Moreover we have

$\Delta \le \max_{m \in \mathcal{L}} \max_{\pi \in \Pi_m^*} \sum_{n \in \mathcal{L}} \sum_{\tau \in \Pi_n^*} |W(\pi,\tau)|,$

where

$W(\pi,\tau) = \int_X \mathrm{Tr}\,\pi(\rho_m(F_x))\,\overline{\mathrm{Tr}\,\tau(\rho_n(F_x))}\, d\mu(x).$
JX 



The general sieve setting can also be applied to problems where the sieving sets are not
conjugacy-invariant, using the basis of matrix coefficients of irreducible representations. Let
$(G, \Lambda, (\rho_\ell))$ be a group sieve setting. For each $\ell$ and each irreducible representation $\pi \in \Pi_\ell$,
choose an orthonormal basis $(e_{\pi,i})$ of the space $V_\pi$ of the representation (with respect to a
$G_\ell$-invariant inner product $\langle \cdot, \cdot \rangle_\pi$). Then (see, e.g., [Kn, §1.5], which treats compact groups), the
family $B_\ell$ of functions of the type

$\varphi_{\pi,e,f} : x \mapsto \sqrt{\dim \pi}\, \langle \pi(x)e, f \rangle_\pi, \qquad e, f \in \{e_{\pi,1}, \ldots, e_{\pi,\dim\pi}\},$

is an orthonormal basis of $L^2(G_\ell)$ for the inner product

$\langle f, g \rangle = \frac{1}{|G_\ell|} \sum_{x \in G_\ell} f(x)\,\overline{g(x)},$

i.e., corresponding to the density $\nu_\ell(x) = 1/|G_\ell|$ for all $x \in G_\ell$. Moreover, for $\pi = 1$, and an
arbitrary choice of $e \in \mathbf{C}$ with $|e| = 1$, the function $\varphi_{1,e,e} \in B_\ell$ is the constant function 1.

If we extend the basis to orthonormal bases $B_m$ of $L^2(G_m)$ for all $m \in S(\Lambda)$, by
multiplicativity, the functions in $B_m$ are of the type

$\varphi_{\pi,e,f} : (x_\ell) \mapsto \prod_{\ell \mid m} \sqrt{\dim \pi_\ell}\, \langle \pi_\ell(x_\ell)e_\ell, f_\ell \rangle_{\pi_\ell}$

where $e = \otimes e_\ell$ and $f = \otimes f_\ell$ run over elements of the orthonormal basis

$\bigl(\bigotimes_{\ell \mid m} e_{\pi_\ell, i_\ell}\bigr), \qquad 1 \le i_\ell \le \dim \pi_\ell,$

constructed from the chosen bases $(e_{\pi_\ell,i})$ of the components, the inner product on the space of
$\boxtimes \pi_\ell$ being the natural $G_m$-invariant one.

The sums $W(\varphi,\varphi')$ occurring in Proposition 2.10 to estimate the large sieve constant are
given by

(4.2) $W(\varphi_{\pi,e,f}, \varphi_{\tau,e',f'}) = \sqrt{(\dim \pi)(\dim \tau)} \int_X \langle \pi(\rho_m(F_x))e, f \rangle_\pi\, \overline{\langle \tau(\rho_n(F_x))e', f' \rangle_\tau}\, d\mu(x).$

If we apply Lemma 2.11 to elements $\varphi_{\pi,e,f}$, $\varphi_{\tau,e',f'}$ of the bases $B_m$ and $B_n$ of $L^2(G_m)$ and $L^2(G_n)$, the
function $[\varphi_{\pi,e,f}, \varphi_{\tau,e',f'}]$ which is integrated can be written as a matrix coefficient of the representation

(4.3) $[\pi, \bar{\tau}] = \pi_{m'} \boxtimes (\pi_d \otimes \overline{\tau_d}) \boxtimes \overline{\tau_{n'}}$

of $G_{[m,n]}$, where we write $\pi = \pi_{m'} \boxtimes \pi_d$, $\tau = \tau_{n'} \boxtimes \tau_d$, with the obvious meaning of the components
$\pi_{m'}$, $\pi_d$, $\tau_d$, $\tau_{n'}$, and the bar indicates taking the contragredient representation.
Indeed, we have

$[\varphi_{\pi,e,f}, \varphi_{\tau,e',f'}](x_\ell) = \sqrt{(\dim \pi)(\dim \tau)}\, \langle [\pi,\bar{\tau}](x_\ell)\tilde{e}, \tilde{f} \rangle_{[\pi,\bar{\tau}]}$

for $(x_\ell) \in G_{[m,n]}$, with $\tilde{e} = e \otimes e'$, $\tilde{f} = f \otimes f'$.

Concretely, this means that in order to deal with the sums $W(\varphi,\varphi')$ to estimate the large
sieve constant using the basis $B_m$ of matrix coefficients, it suffices to be able to estimate all
integrals of the type

(4.4) $\int_X \langle \varpi(F_x)e, f \rangle\, d\mu(x),$

where $\varpi$ is a representation of $G$ that factors through a finite product of groups $G_\ell$, and $e$, $f$
are vectors in the space of the representation $\varpi$ (the inner product being $G$-invariant). See the
proof of Theorem 11.31 for an application of this. 

Remark 4.2. Another potentially useful sieve setting associated to a group sieve setting $(G, \Lambda, \rho_\ell)$
is obtained by replacing $\rho_\ell$ with the projections $G \to G_\ell \to G_\ell/K_\ell = Y_\ell$ for $\ell \in \Lambda$, where $K_\ell$
is an arbitrary subgroup of $G_\ell$. Considering the density on $Y_\ell$ which is the image of the uniform
density on $G_\ell$, an orthonormal basis $B_\ell$ of $L^2(Y_\ell)$ is then obtained by taking the functions

$\varphi_{\pi,e,f} : gK_\ell \mapsto \sqrt{\dim \pi}\, \langle \pi(g)e, f \rangle_\pi$

where $\pi$ runs over irreducible representations of $G_\ell$, $e$ runs over an orthonormal basis of the
$K_\ell$-invariant subspace in the space $V_\pi$ of $\pi$, and $f$ over a full orthonormal basis of $V_\pi$.

Indeed, the restriction on $e$ ensures that such functions are well-defined on $G_\ell/K_\ell$ (i.e., the
matrix coefficient is $K_\ell$-invariant), and since those are matrix coefficients, there only remains
to check that they span $L^2(Y_\ell)$. However, the total number of functions is

$\sum_\pi (\dim \pi)\, \langle \mathrm{Res}^{G_\ell}_{K_\ell} \pi, 1 \rangle_{K_\ell} = \sum_\pi (\dim \pi)\, \langle \pi, \mathrm{Ind}^{G_\ell}_{K_\ell} 1 \rangle_{G_\ell} = \dim \mathrm{Ind}^{G_\ell}_{K_\ell} 1 = |G_\ell/K_\ell|,$

and since they are independent, the result follows.

Because this basis is a sub-basis of the previous one, any estimate for the large sieve constant
for the group sieve will give one for this sieve setting.

5. Elementary and classical examples 

We first describe how some classical uses of the large sieve are special cases of the group sieve 
setting of the previous section, and conclude this section with a "new" example of the general 
case which is particularly easy to analyze (and of little practical use), and hence somewhat 
enlightening. 

Example 5.1. As already mentioned, the classical large sieve arises from the group sieve setting

$\Psi = (\mathbf{Z}, \{\text{primes}\}, \mathbf{Z} \to \mathbf{Z}/\ell\mathbf{Z})$

where the condition for an additive character $x \mapsto e(ax/m)$ of $G_m = \mathbf{Z}/m\mathbf{Z}$ to be primitive is
equivalent with the classical condition that $(a,m) = 1$.
In the most typical case, the siftable sets are

$X = \{n \ge 1 \mid N \le n < N + M\}$

with $F_x = x$, and the abstract sieving problem becomes the "original" one of finding integers
in $X$ which lie outside certain residue classes modulo some primes $\ell$.

More generally, take

$\Psi = (\mathbf{Z}^r, \{\text{primes}\}, \mathbf{Z}^r \to (\mathbf{Z}/\ell\mathbf{Z})^r)$

(the reduction maps) and $X = \{(a_1, \ldots, a_r) \mid N_i \le a_i < N_i + M_i\}$, with $F$ the identity map
again. Then what results is the higher-dimensional large sieve (see e.g. [H]).

For completeness, we recall the estimates available for the large sieve constant in those two
situations, when we take $\mathcal{L}^*$ to be the set of primes $\le L$, and $\mathcal{L}$ to be the set of squarefree
integers $\le L$, for some $L \ge 1$. We write $S(X;\Omega,L)$ instead of $S(X;\Omega,\mathcal{L}^*)$.

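Theorem 5.2. With notation as above, we have $\Delta \le N - 1 + L^2$ for $r = 1$ and $\Delta \le (\sqrt{N} + L)^{2r}$
for all $r \ge 1$. In particular, for any sieve problem, we have

$|S(X;\Omega,L)| \le (N - 1 + L^2)\,H^{-1}$, if $r = 1$,

$|S(X;\Omega,L)| \le (\sqrt{N} + L)^{2r} H^{-1}$, if $r \ge 1$,

where

$H = \sum^{\flat}_{m \le L} \prod_{\ell \mid m} \frac{|\Omega_\ell|}{\ell^r - |\Omega_\ell|},$

the notation $\sum^{\flat}$ indicating a sum restricted to squarefree numbers.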

In the one-variable case, this is due essentially to Montgomery, and to Selberg with the
constant $N - 1 + L^2$, see e.g. [IK, §7.5]; the higher-dimensional case as stated is due to Huxley,
see [Hu]. Note that modern treatments deduce such estimates from an analytic inequality which
is more general than the ones we used, namely, the inequality

$\sum_r \Bigl| \sum_{M < n \le M+N} a_n e(n\xi_r) \Bigr|^2 \le (N - 1 + \delta^{-1}) \sum_{M < n \le M+N} |a_n|^2$

for arbitrary sets $(\xi_r)$ of elements in $\mathbf{R}/\mathbf{Z}$ which are $\delta$-spaced, i.e., the distance $d(\xi_r,\xi_s)$ in $\mathbf{R}/\mathbf{Z}$
is at least $\delta$ if $r \ne s$ (this was first considered by Bombieri and Davenport; see, e.g., [IK, Th.
7.7]). This amounts, roughly, to considering sums

$\sum_{M < n \le M+N} \pi_r(n)\,\overline{\pi_s(n)}$

where $\pi_r : n \mapsto e(n\xi_r)$ and $\pi_s$ are representations of $G = \mathbf{Z}$ which do not factor through a finite
index subgroup. This suggests trying to prove similar inequalities for general group sieves, i.e.,
essentially, consider integrals (4.4) for arbitrary (unitary) representations $\varpi$ of $G$.
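The analytic inequality can be tested numerically. The sketch below is only an illustration under stated assumptions: it takes the points $\xi_r$ to be the fractions $a/q$ with $q \le Q$ and $(a,q) = 1$, which are $\delta$-spaced with $\delta = 1/Q^2$, and compares both sides with the constant $N - 1 + \delta^{-1} = N - 1 + Q^2$. The helper names are hypothetical.

```python
import cmath
import random
from math import gcd


def farey_points(Q):
    """Fractions a/q in [0, 1) with q <= Q and gcd(a, q) = 1.

    Two distinct such points are at distance >= 1/Q^2 in R/Z.
    """
    return [a / q for q in range(1, Q + 1) for a in range(q) if gcd(a, q) == 1]


def exponential_sum(coeffs, M, xi):
    """Sum of a_n e(n xi) over M < n <= M + N, where N = len(coeffs)."""
    return sum(c * cmath.exp(2j * cmath.pi * n * xi)
               for n, c in enumerate(coeffs, start=M + 1))


def large_sieve_sides(coeffs, M, Q):
    """Return (lhs, rhs) of the large sieve inequality at the Farey points."""
    N = len(coeffs)
    lhs = sum(abs(exponential_sum(coeffs, M, xi)) ** 2 for xi in farey_points(Q))
    rhs = (N - 1 + Q * Q) * sum(abs(c) ** 2 for c in coeffs)
    return lhs, rhs


if __name__ == "__main__":
    random.seed(0)
    a = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(40)]
    print(large_sieve_sides(a, 17, 6))
```

Restricting to squarefree $q$ would give exactly the sums relevant to Theorem 5.2 for $r = 1$; using all $q \le Q$ only makes the left-hand side larger.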
Note that for $r = 1$, the equidistribution assumption (2.9) becomes

$\sum_{N \le n < N+M,\ n \equiv y \ (\mathrm{mod}\ d)} 1 = \frac{|X|}{d} + r_d(X;y),$

which holds with $|r_d(X;y)| \le 1$ for any $y \in \mathbf{Z}/d\mathbf{Z}$. From (6.9) we obtain the estimate $\Delta \le
N + L^4$, which is by no means ridiculous. (See Section 6 for the computation of the quantity
$m([\varphi,\varphi'])$ for group sieves, or do the exercise).
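The bound $|r_d(X;y)| \le 1$ for an interval is elementary, and the following short sketch simply checks it by brute force. The interval endpoints and the function names are illustrative choices, not taken from the text.

```python
def residue_counts(start, length, d):
    """Count the integers n with start <= n < start + length in each class mod d."""
    counts = [0] * d
    for n in range(start, start + length):
        counts[n % d] += 1
    return counts


def max_error(start, length, d):
    """Largest |r_d(X; y)| over y, where r_d(X; y) = #{n in X : n = y mod d} - |X|/d."""
    return max(abs(c - length / d) for c in residue_counts(start, length, d))


if __name__ == "__main__":
    for d in (2, 3, 6, 35, 210):
        print(d, max_error(1000, 137, d))
```

Each count is either the floor or the ceiling of $|X|/d$, which is why the error never exceeds 1.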

Classical sieve theory is founded on such assumptions as (2.9), usually stated merely for
$y = 0$, and on further assumptions concerning the resulting level of distribution, i.e., bounds for
$r_d(X; 0)$ on average over $d$ in a range as large as possible (compared with the size of $X$). More
general bounds for $r_d(X; y)$ do occur however.

Note that, even if this is classical, the general framework clearly shows that to sieve an
arbitrary set of integers $X \subset \{n \mid n \ge 1\} \subset \mathbf{Z}$, it suffices (at least up to a point!) to have
estimates for exponential sums

$\sum_{x \in X} e\Bigl(\frac{ax}{m} - \frac{bx}{n}\Bigr)$

with $n$, $m$ squarefree and $(a,m) = (b,n) = 1$. It suffices, in particular, to have equidistribution
of $X$ in (all) arithmetic progressions. This means for instance that some measure of large sieve is
usually doable for any sequence for which the classical "small" sieves work. This is of particular
interest if $X$ is "sparse", in the sense that e.g. $X \subset \{n \mid N < n \le 2N\}$ for some $N$ with $|X|/N$
going to zero.

It would also be interesting, as a problem in itself, to investigate the values of the large sieve 
constant when using other sieve support than squarefree integers up to L, for instance when 
the sieve support is the support of a combinatorial (small) sieve. 

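Example 5.3. Can the multiplicative large sieve inequality for Dirichlet characters be related
to our general setting? Indeed, in at least two ways. First, let $q \ge 2$ be given, let $G$ be the
multiplicative subgroup of $\mathbf{Q}^\times$ generated by primes $p > q$, and take

$\Psi = (G, \{\text{primes } \ell \le q\}, G \to (\mathbf{Z}/\ell\mathbf{Z})^\times = G_\ell).$

In that context, we can take

$X = \{n \le N \mid p \mid n \Rightarrow p > q\},$

and $F_x = x$, and if $\mathcal{L}^*$ is the set of primes $\le L \le q$, and $\mathcal{L}$ is the set of squarefree numbers $\le L$,
the sifted sets become

$S(X;\Omega,L) = \{n \le N \mid (p \mid n \Rightarrow p > q) \text{ and } n \ (\mathrm{mod}\ \ell) \notin \Omega_\ell \text{ for } \ell \le L \le q\},$

where $\Omega_\ell \subset (\mathbf{Z}/\ell\mathbf{Z})^\times$. A simple check shows that the inequality defining the large sieve constant
$\Delta$ becomes

(5.1) $\sum^{\flat}_{m \le L}\ \sum^{*}_{\chi \ (\mathrm{mod}\ m)} \Bigl| \sum_{n \in X} a_n \chi(n) \Bigr|^2 \le \Delta \sum_{n \in X} |a_n|^2$

for any complex numbers $a_n$, where $\chi$ runs over primitive characters modulo $m$, and hence
$\Delta \le N - 1 + L^2$ by the multiplicative large sieve inequality (see e.g. [IK, Th. 7.13]).

Alternately, if we allow the density $\nu_\ell$ to have zeros, we may take the classical sieve setting
$Y = \mathbf{Z}$, $Y \to Y_\ell = \mathbf{Z}/\ell\mathbf{Z}$, $X = \{N < n \le M + N\}$, $F_x = x$, with density

$\nu_\ell(y) = \frac{1}{\ell - 1}$ if $y \ne 0$, $\quad \nu_\ell(0) = 0$,

and then check that since the final statements do not involve the inverse of $\nu_\ell(y)$, although the
proofs involved division by $\nu_\ell(y)$, it remains true that for $\Omega_\ell \subset \mathbf{Z}/\ell\mathbf{Z} - \{0\}$, we have

$|S(X,\Omega;L)| \le \Delta H^{-1}$

where

$H = \sum^{\flat}_{m \le L} \prod_{\ell \mid m} \frac{|\Omega_\ell|}{\ell - 1 - |\Omega_\ell|}$

and $\Delta$ is the multiplicative large sieve constant defined by (5.1) (e.g., use a positive perturbation
$\nu_{\ell,\varepsilon}(y) > 0$ of the density so that $H_\varepsilon \to H$ and $\Delta_\varepsilon \to \Delta$, as $\varepsilon \to 0$). Again, we have $\Delta \le
N - 1 + L^2$.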



Example 5.4. Serre [S3] has used a variant of the higher-dimensional large sieve where

$\Psi = (\mathbf{Z}^r, \{\text{primes}\}, \mathbf{Z}^r \to (\mathbf{Z}/\ell^2\mathbf{Z})^r)$

and

$X = \{(x_1, \ldots, x_r) \in \mathbf{Z}^r \mid |x_i| < N\}$

with $F_x = x$. With suitable sifting sets, this provides estimates for the number of trivial
specializations of elements of 2-torsion in the Brauer group of $\mathbf{Q}(T_1, \ldots, T_r)$.

Example 5.5. Here is a new example, which is a number field analogue of the situation
of [Ko1] (described also in Section 11). It is related to Serre's discussion in [S2] of a
higher-dimensional Chebotarev density theorem over number fields (see also [P] for an independent
treatment with more details). Let $Y/\mathbf{Z}$ be a separated scheme of finite type, and let $Y_\ell \to Y$ be
a family of étale Galois coverings,² corresponding to surjective maps $G = \pi_1(Y, \bar{\eta}) \to G_\ell$. The
sieve setting is $(G, \{\text{primes}\}, G \to G_\ell)$. Now let $|Y|$ denote the set of closed points of $Y$, which
means those where the residue field $k(y)$ is finite, and let

$X = \{y \in |Y| \mid |k(y)| \le T\}$

for some $T \ge 2$, which is finite. For $y \in X$, denote by $F_y \in G$ the corresponding geometric
Frobenius automorphism (or conjugacy class rather) to obtain a siftable set $(X, \text{counting measure}, F)$
associated with the conjugacy sieve. It should be possible to obtain a large sieve inequality in
this context, at least assuming GRH and the Artin conjecture.

Note that if $Y$ is the spectrum of the ring of integers in some number field (or even $Y =
\mathrm{Spec}(\mathbf{Z})$ itself), this becomes the sieve for Frobenius considered by Zywina [Z], with applications
(under GRH) to the Lang-Trotter conjecture, and to Koblitz's conjecture for elliptic curves over
number fields.

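Example 5.6. The next example illustrates the general sieve setting, showing that it includes
(and extends) the inclusion-exclusion familiar in combinatorics and probability theory, and also
that the large sieve inequality is sharp in this general context (i.e., there may be equality
$|S(X,\Omega;\mathcal{L}^*)| = \Delta H^{-1}$).

Let $(\Omega, \Sigma, \mathbf{P})$ be a probability space and $A_\ell \in \Sigma$, for $\ell \in \Lambda$, a countable family of events.
Consider the event

$A = \{\omega \in \Omega \mid \omega \notin A_\ell \text{ for any } \ell \in \Lambda\}.$

For $m \in S(\Lambda)$, denote

$A_m = \bigcap_{\ell \in m} A_\ell, \qquad A_\emptyset = \Omega.$

If $\Lambda$ is finite, which we now assume, the inclusion-exclusion formula is

$\mathbf{P}(A) = \sum_{m \in S(\Lambda)} (-1)^{|m|}\, \mathbf{P}(A_m),$

and in particular, if the events are independent (as a whole), we have

$\mathbf{P}(A_m) = \prod_{\ell \in m} \mathbf{P}(A_\ell)$, and $\mathbf{P}(A) = \prod_{\ell \in \Lambda} (1 - \mathbf{P}(A_\ell)).$

Take the sieve setting $(\Omega, \Lambda, \mathbf{1}_{A_\ell})$, where $\mathbf{1}_B$ is the characteristic function of an event $B$, with
$Y_\ell = \{0, 1\}$ for all $\ell$, and the siftable set $(\Omega, \mathbf{P}, \mathrm{Id})$. Choose the density $\nu_\ell = \mathbf{1}_{A_\ell}(\mathbf{P})$, i.e., put

$\nu_\ell(1) = \mathbf{P}(A_\ell), \qquad \nu_\ell(0) = 1 - \mathbf{P}(A_\ell).$

With sieving sets $\Omega_\ell = \{1\}$ for $\ell \in \Lambda$, we have precisely $S(X, \Omega) = A$.
The large sieve inequality yields

$\mathbf{P}(A) \le \Delta H^{-1}$

where

$H = \sum_{m \in \mathcal{L}} \prod_{\ell \in m} \frac{\mathbf{P}(A_\ell)}{1 - \mathbf{P}(A_\ell)},$

and $\Delta$ is the large sieve constant for the sieve support $\mathcal{L}$, which may be any collection of subsets
of $\Lambda$ such that $\{\ell\} \in \mathcal{L}$ for all $\ell \in \Lambda$.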



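² Or better with "controlled" ramification, if not étale, since this is likely to be needed for some natural
applications.

Coming to the large sieve constant, note that $L^2_0(Y_\ell)$ is one-dimensional for all $\ell$, hence so is
$L^2_0(Y_m)$ for all $m$ (including $m = \emptyset$). The basis function $\varphi_\ell$ for $L^2_0(Y_\ell)$ (up to multiplication by
a complex number with modulus 1) is given by

$\varphi_\ell(y) = \frac{y - p_\ell}{\sqrt{p_\ell(1 - p_\ell)}}$

where $p_\ell = \mathbf{P}(A_\ell)$ for simplicity, so that

$\varphi_\ell(\mathbf{1}_{A_\ell}) = \frac{\mathbf{1}_{A_\ell} - \mathbf{P}(A_\ell)}{\sqrt{\mathbf{V}(\mathbf{1}_{A_\ell})}}$

and in particular

$\mathbf{E}(\varphi_\ell(\mathbf{1}_{A_\ell})) = \langle \varphi_\ell, 1 \rangle = 0, \qquad \mathbf{E}(\varphi_\ell(\mathbf{1}_{A_\ell})^2) = \|\varphi_\ell\|^2 = 1.$

Hence, for $\ell, \ell' \in \Lambda$, $W(\varphi_\ell, \varphi_{\ell'})$ is given by

$W(\varphi_\ell, \varphi_{\ell'}) = \mathbf{E}\bigl(\varphi_\ell(\mathbf{1}_{A_\ell})\,\varphi_{\ell'}(\mathbf{1}_{A_{\ell'}})\bigr)$

and it is (by definition) the correlation coefficient of the random variables $\mathbf{1}_{A_\ell}$ and $\mathbf{1}_{A_{\ell'}}$, explicitly

$W(\varphi_\ell, \varphi_{\ell'}) = 1$ if $\ell = \ell'$, and
$W(\varphi_\ell, \varphi_{\ell'}) = \frac{\mathbf{P}(A_\ell \cap A_{\ell'}) - \mathbf{P}(A_\ell)\mathbf{P}(A_{\ell'})}{\sqrt{p_\ell(1 - p_\ell)\,p_{\ell'}(1 - p_{\ell'})}}$ otherwise.

If (and only if) the $(A_\ell)$ form a family of pairwise independent events, we see that $W(\varphi_\ell, \varphi_{\ell'}) =
\delta(\ell,\ell')$. More generally, in all cases, for any $m, n \subset \Lambda$, we have

$W(\varphi_m, \varphi_n) = \mathbf{E}\Bigl(\prod_{\ell \in m} \varphi_\ell(\mathbf{1}_{A_\ell}) \prod_{\ell \in n} \varphi_\ell(\mathbf{1}_{A_\ell})\Bigr)$

which is a multiple normalized centered moment of the $\mathbf{1}_{A_\ell}$.
If the $(A_\ell)$ are globally independent, we obtain

$W(\varphi_m, \varphi_n) = \prod_{\ell \in m \cap n} \mathbf{E}\bigl(\varphi_\ell(\mathbf{1}_{A_\ell})^2\bigr) \prod_{\ell \in (m \cup n) - (m \cap n)} \mathbf{E}\bigl(\varphi_\ell(\mathbf{1}_{A_\ell})\bigr) = \delta(m, n)$

(since the second product vanishes if it is not empty, i.e., if $m \ne n$, and the first product is
1 by orthonormality of $\varphi_\ell$). It follows by (2.7) that $\Delta \le 1$, and in fact there must be equality.
Moreover, in this situation, if $\mathcal{L}$ contains all subsets of $\Lambda$, we have

$H = \prod_{\ell \in \Lambda} \Bigl(1 + \frac{p_\ell}{1 - p_\ell}\Bigr) = \prod_{\ell \in \Lambda} \frac{1}{1 - p_\ell},$

so that we find

$\Delta H^{-1} = \prod_{\ell \in \Lambda} (1 - \mathbf{P}(A_\ell)) = \mathbf{P}(A),$

i.e., the large sieve inequality is an equality here.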
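The equality case can be verified with exact arithmetic. The sketch below assumes globally independent events with given probabilities $p_\ell$ (so that $\Delta = 1$ as explained above) and a sieve support consisting of all subsets of $\Lambda$; the function names and the sample probabilities are illustrative only.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def inclusion_exclusion(probs):
    """P(A) for independent events: sum over subsets m of (-1)^|m| prod_{l in m} p_l."""
    total = Fraction(0)
    for k in range(len(probs) + 1):
        for m in combinations(probs, k):
            total += (-1) ** k * prod(m, start=Fraction(1))
    return total


def sieve_H(probs):
    """H = sum over all subsets m of prod_{l in m} p_l / (1 - p_l)."""
    total = Fraction(0)
    for k in range(len(probs) + 1):
        for m in combinations(probs, k):
            total += prod((p / (1 - p) for p in m), start=Fraction(1))
    return total


if __name__ == "__main__":
    p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 5)]
    print(inclusion_exclusion(p), 1 / sieve_H(p))  # both equal prod (1 - p_l)
```

With $\Delta = 1$, the large sieve bound $\Delta H^{-1}$ coincides with the inclusion-exclusion value of $\mathbf{P}(A)$.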

Similarly, the inequality (3.1) becomes an equality if the events are pairwise independent,
and reflects the formula for the variance of a sum of (pairwise) independent random variables.

In the general case of possibly dependent events, on the other hand, we have a quantitative
inequality for $\mathbf{P}(A)$ which may be of some interest (and may be already known!). In fact, we
have several possibilities depending on the choice of sieve support. It would be interesting to
determine if those inequalities are of some use in probability theory.

To conclude this example, note that any sieve, once the prime sieve support $\mathcal{L}^*$ and the
sieving sets $(\Omega_\ell)$ are chosen, may be considered as a similar "binary" sieve with $Y_\ell = \{0, 1\}$ for
all $\ell$, by replacing the sieve setting $(Y, \Lambda, (\rho_\ell))$ with $(Y, \mathcal{L}^*, \mathbf{1}_{\Omega_\ell})$.

Example 5.7. There are a few examples of the use of simple sieve methods in combinatorics.
An example is a paper of Liu and Murty [LM] (mentioned to us by A. Granville), which explores
(with some interesting combinatorial applications) a simple form of the dual sieve. Their sieve
setting amounts to taking $\Psi = (A, B, \mathbf{1}_b)$ where $A$ and $B$ are finite sets, and for each $b \in B$,
we have a map $\mathbf{1}_b : A \to \{0, 1\}$ (in loc. cit., the authors see $(A,B)$ as a bipartite graph, and
$\mathbf{1}_b(a) = 1$ if and only if there is an edge from $a$ to $b$); the siftable set is $A$ with identity map
and counting measure, and the density is determined by $\nu_b(1) = |\mathbf{1}_b^{-1}(1)|/|A|$. In other words,
this is also a special case of the previous example, and Theorem 1 and Corollary 1 of loc. cit.
can also be trivially deduced from this (though they are simple enough to be better considered
separately).

6. Coset sieves

Our next subject is a generalization of group sieves, which is the setting in which the Frobenius
sieve over finite fields of [Ko1] and Section 11 operates.

As in Section 4, we start with a group $G$ and a family of surjective homomorphisms $G \to G_\ell$,
for $\ell \in \Lambda$, onto finite groups. However, we also assume that there is a normal subgroup $G^g$ of
$G$ such that the quotient $G/G^g$ is abelian, and hence we obtain a commutative diagram with
exact rows



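(6.1)
$\begin{array}{ccccccccc}
1 & \to & G^g & \to & G & \xrightarrow{\ d\ } & G/G^g & \to & 1 \\
 & & \downarrow & & \downarrow \rho_\ell & & \downarrow p & & \\
1 & \to & G_\ell^g & \to & G_\ell & \xrightarrow{\ d\ } & \Gamma_\ell^g & \to & 1,
\end{array}$

where the downward arrows are surjective and the quotient groups $\Gamma_\ell^g$ thus defined are finite
abelian groups.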

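After extending the definition of $G_m$ to elements of $S(\Lambda)$ by multiplicativity, we can also
define

$G_m^g = \prod_{\ell \mid m} G_\ell^g, \qquad \Gamma_m^g = \prod_{\ell \mid m} \Gamma_\ell^g,$

and we still can write commutative diagrams with exact rows

(6.2)
$\begin{array}{ccccccccc}
1 & \to & G^g & \to & G & \xrightarrow{\ d\ } & G/G^g & \to & 1 \\
 & & \downarrow & & \downarrow \rho_m & & \downarrow p & & \\
1 & \to & G_m^g & \to & G_m & \xrightarrow{\ d\ } & \Gamma_m^g & \to & 1,
\end{array}$

(but the downward arrows are no longer necessarily surjective).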

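The sieve setting for a coset sieve is then $(Y, \Lambda, (\rho_\ell))$ where $Y$ is the set of $G$-conjugacy classes
in $d^{-1}(\alpha)$ for some fixed $\alpha \in G/G^g$. Since $G^g$ is normal in $G$, this set is indeed invariant under
conjugation by the whole of $G$ (this is an important point). We let $\rho_\ell$ be the induced map

$Y \to Y_\ell = \{y^\sharp \in G_\ell^\sharp \mid d(y^\sharp) = p(\alpha)\} \subset G_\ell^\sharp.$

The natural density to consider (which arises in the sieve for Frobenius) is still

$\nu_\ell(y^\sharp) = \frac{|y^\sharp|}{|G_\ell^g|}$

and hence

$\nu_m(y^\sharp) = \frac{|y^\sharp|}{|G_m^g|}$

for a conjugacy class $y^\sharp$. Note that this means that for any conjugacy-invariant subset $\Omega_\ell \subset G_\ell$,
union of a set $\Omega_\ell^\sharp$ of conjugacy classes such that $\Omega_\ell^\sharp \subset d^{-1}(p(\alpha)) = Y_\ell$, we have

$\nu_\ell(\Omega_\ell^\sharp) = \frac{|\Omega_\ell|}{|G_\ell^g|}.$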

We turn to the question of finding a suitable orthonormal basis of $L^2(Y_\ell, \nu_\ell)$. This is provided
by the following general lemma, which applies equally to $H = G_\ell$ and to $H = G_m$ for $m \in S(\Lambda)$.

Lemma 6.1. Let $H$ be a finite group, $H^g$ a subgroup with abelian quotient $\Gamma = H/H^g$. Let
$\alpha \in \Gamma$ and $Y$ the set of conjugacy classes of $H$ with image $\alpha$ in $\Gamma$.

For an irreducible linear representation $\pi$ of $H$, let $\varphi_\pi$ be the function

$\varphi_\pi : y^\sharp \mapsto \mathrm{Tr}\,\pi(y^\sharp)$

on $H^\sharp$.

(1) For $\pi$, $\tau$ irreducible linear representations of $H$, we have

(6.3) $\langle \varphi_\pi, \varphi_\tau \rangle = 0$, if either $\pi|_{H^g} \not\simeq \tau|_{H^g}$ or $\varphi_\pi|_Y = 0$,
$\langle \varphi_\pi, \varphi_\tau \rangle = \overline{\psi(\alpha)}\,|\hat{\Gamma}^\pi|$, where $\psi \in \hat{\Gamma}$ satisfies $\pi \otimes \psi \simeq \tau$, otherwise,

where $\hat{\Gamma}$ is the group of characters of $\Gamma$, $\hat{\Gamma}^\pi = \{\psi \in \hat{\Gamma} \mid \pi \simeq \pi \otimes \psi\}$, and the inner product is

$\langle f, g \rangle = \frac{1}{|H^g|} \sum_{y^\sharp \in Y} |y^\sharp|\, f(y^\sharp)\,\overline{g(y^\sharp)}.$

(2) Let $B$ be the family of functions

$\frac{1}{\sqrt{|\hat{\Gamma}^\pi|}}\,\varphi_\pi$

restricted to $Y$, where $\pi$ ranges over the subset of a set of representatives for the equivalence
relation

$\pi \sim \tau$ if and only if $\pi|_{H^g} \simeq \tau|_{H^g}$,

consisting of those representatives such that $\varphi_\pi|_Y \ne 0$. Then $B$ is an orthonormal basis of
$L^2(Y)$ for the above inner product.

In the second case of (6.3), the existence of the character $\psi$ will follow from the proof below.

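Proof. We have

$\langle \varphi_\pi, \varphi_\tau \rangle = \frac{1}{|H^g|} \sum_{y \in H,\ d(y) = \alpha} \mathrm{Tr}\,\pi(y)\,\overline{\mathrm{Tr}\,\tau(y)}$

$= \frac{1}{|H^g|} \sum_{y \in H} \Bigl( \frac{1}{|\Gamma|} \sum_{\psi \in \hat{\Gamma}} \overline{\psi(\alpha)}\,\psi(y) \Bigr) \mathrm{Tr}\,\pi(y)\,\overline{\mathrm{Tr}\,\tau(y)}$

$= \sum_{\psi \in \hat{\Gamma}} \overline{\psi(\alpha)}\, \langle \pi \otimes \psi, \tau \rangle_H = \sum_{\psi \in \hat{\Gamma}} \overline{\psi(\alpha)}\, \delta(\pi \otimes \psi, \tau),$

by orthogonality of characters of irreducible representations in $L^2(H)$.

First of all, this is certainly zero unless there exists at least one $\psi$ such that $\pi \otimes \psi \simeq \tau$. In
such a case we have $\pi|_{H^g} \simeq \tau|_{H^g}$ since $H^g \subset \ker(\psi)$, so we have shown that the condition
$\pi|_{H^g} \not\simeq \tau|_{H^g}$ implies that the inner product is zero.

Assume now that $\pi|_{H^g} \simeq \tau|_{H^g}$; then repeating the above with $\alpha = 1$ (i.e., $Y = H^g$), it
follows from $\langle \pi, \tau \rangle_{H^g} \ne 0$ that there exists one $\psi$ at least such that $\pi \otimes \psi \simeq \tau$.

Fixing one such character $\psi_0$, the characters $\psi'$ for which $\pi \otimes \psi' \simeq \tau$ are given by $\psi' = \psi\psi_0$
where $\psi \in \hat{\Gamma}^\pi$. Then we find

$\langle \varphi_\pi, \varphi_\tau \rangle = \sum_{\psi \in \hat{\Gamma}^\pi} \overline{\psi\psi_0(\alpha)}\, \delta(\pi \otimes \psi\psi_0, \pi \otimes \psi_0) = \overline{\psi_0(\alpha)} \sum_{\psi \in \hat{\Gamma}^\pi} \overline{\psi(\alpha)}.$

For any $\psi \in \hat{\Gamma}^\pi$ and $y^\sharp \in Y$, we have the character relation

$\mathrm{Tr}\,\pi(y^\sharp) = \psi(y^\sharp)\,\mathrm{Tr}\,\pi(y^\sharp) = \psi(\alpha)\,\mathrm{Tr}\,\pi(y^\sharp),$

hence either $\psi(\alpha) = 1$ for all $\psi$, or $\mathrm{Tr}\,\pi(y^\sharp) = 0$ for all $y^\sharp$, i.e., $\varphi_\pi$ restricted to $Y$ vanishes. In
this last case, we have trivially $\varphi_\tau = 0$ also on $Y$, and the inner product vanishes.

So we are led to the last case where $\pi|_{H^g} \simeq \tau|_{H^g}$ but $\psi(\alpha) = 1$ for all $\psi \in \hat{\Gamma}^\pi$. Then the
inner product formula is clear from the above.

Now to prove (2) from (1), notice first that the family $B$ is a generating set of $L^2(Y)$ (indeed,
all $\varphi_\pi$ generate $L^2(H^\sharp)$, but those $\pi$ for which $\varphi_\pi = 0$ on $Y$ are clearly not needed, and if $\pi \sim \tau$,
we have $\varphi_\tau = \psi(\alpha)\varphi_\pi$ on $Y$, where $\psi$ satisfies $\tau \simeq \pi \otimes \psi$, so one element of each equivalence
class suffices for functions on $Y$). Then the fact that we have an orthonormal basis follows from
the inner product formula, observing that if $\tau \simeq \pi \otimes \psi$, we have in fact $\pi = \tau$ by definition of
the equivalence relation, so $\psi = 1$ in (6.3). □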

Example 6.2. In this lemma we emphasize that distinct representations of $H$ may give the
same restriction on $H^g$, in which case they correspond to a single element of the basis, and that
it is possible that some $\varphi_\pi$ vanishes on $Y$, in which case its representative is discarded from the
basis.

Take for instance $G = D_n$, a dihedral group of order $2n$. There is an exact sequence

$1 \to \mathbf{Z}/n\mathbf{Z} \to G \xrightarrow{\ d\ } \mathbf{Z}/2\mathbf{Z} \to 1$

and if $Y = d^{-1}(1) \subset G$ and $\pi$ is any representation of $G$ of degree 2, we have $\mathrm{Tr}\,\pi(x) = 0$ for
all $x \in Y$ (see e.g. [S1, 5.3]).

In particular, for $n = 4$, note that even though both cosets of $\mathbf{Z}/n\mathbf{Z}$ in $G$ have four elements, the sets of
conjugacy classes in each do not have the same cardinality (there are 5 conjugacy classes, 3 in
$\ker d$ and 2 in the other coset). In other words, in a coset sieve, the spaces $Y_m$ usually depend
on the value of $\alpha$ (they are usually not even isomorphic).
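The class counts in this example are easy to confirm by direct enumeration. The sketch below models $D_n$ by pairs $(k, e)$ standing for $r^k s^e$, with $r$ a rotation of order $n$ and $s$ a reflection satisfying $s r s = r^{-1}$; this encoding and the function names are choices made only for the illustration.

```python
def dihedral_elements(n):
    """Elements of D_n as pairs (k, e) representing r^k s^e."""
    return [(k, e) for e in (0, 1) for k in range(n)]


def multiply(a, b, n):
    """(r^k s^e)(r^j s^f) = r^(k + (-1)^e j) s^(e + f), using s r s = r^-1."""
    (k, e), (j, f) = a, b
    return ((k + (j if e == 0 else -j)) % n, (e + f) % 2)


def inverse(a, n):
    """Rotations invert to the opposite rotation; reflections are involutions."""
    k, e = a
    return ((-k) % n, 0) if e == 0 else (k, 1)


def conjugacy_classes(n):
    """List the conjugacy classes of D_n as sets of elements."""
    elements = dihedral_elements(n)
    seen, classes = set(), []
    for x in elements:
        if x in seen:
            continue
        cls = {multiply(multiply(g, x, n), inverse(g, n), n) for g in elements}
        seen |= cls
        classes.append(cls)
    return classes


def classes_by_coset(n):
    """Number of classes in ker d (rotations) and in the other coset (reflections)."""
    classes = conjugacy_classes(n)
    in_kernel = sum(1 for c in classes if all(e == 0 for _, e in c))
    return in_kernel, len(classes) - in_kernel


if __name__ == "__main__":
    print(classes_by_coset(4))  # classes in the kernel, classes in the other coset
```

For $n = 4$ both cosets have four elements, yet the rotation coset splits into more conjugacy classes than the reflection coset, which is the asymmetry described above.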

If we apply Lemma 6.1 to the groups $G_m$ and their subgroups $G_m^g$, we clearly obtain
orthonormal bases of $L^2(Y_m)$ containing the constant function 1, for the density $\nu_m$ above. Although it
was not phrased in this manner, this is what was used in [Ko1] (with minor differences, e.g.,
the upper bound $\kappa$ for the order of $\hat{\Gamma}_m^\pi$ that occurs in loc. cit., and can be removed, as also
noticed independently by Zywina in a private email).

As before we summarize the sieve statement: 

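Proposition 6.3. Let $G$ be a group, $G^g$ a normal subgroup with abelian quotient, $\rho_\ell : G \to G_\ell$
a family of surjective homomorphisms onto finite groups. Let $(Y, \Lambda, (\rho_\ell))$ be the coset sieve
setting associated with some $\alpha \in G/G^g$, and $(X, \mu, F)$ an associated siftable set.

For $m \in S(\Lambda)$, let $\Pi_m$ be a set of representatives of the set of irreducible representations of
$G_m$ modulo equality restricted to $G_m^g$, containing the constant function 1. Moreover, let $\Pi_m^*$ be
the subset of primitive representations, i.e., those such that when $\pi$ is decomposed as $\boxtimes_{\ell \mid m} \pi_\ell$,
no component $\pi_\ell$ is trivial, and $\mathrm{Tr}\,\pi_\ell$ is not identically zero on $Y_\ell$.

Then, for any prime sieve support $\mathcal{L}^*$ and associated sieve support $\mathcal{L}$, i.e., such that (2.1)
holds, and for any conjugacy invariant sifting sets $(\Omega_\ell)$ with $\Omega_\ell \subset Y_\ell$ for $\ell \in \mathcal{L}^*$, we have

$|S(X,\Omega;\mathcal{L}^*)| \le \Delta H^{-1}$

where $\Delta$ is the smallest non-negative real number such that

$\sum_{m \in \mathcal{L}} \sum_{\pi \in \Pi_m^*} \Bigl| \int_X \alpha(x)\, \mathrm{Tr}\,\pi(\rho_m(F_x))\, d\mu(x) \Bigr|^2 \le \Delta \int_X |\alpha(x)|^2\, d\mu(x)$

for all square-integrable functions $\alpha \in L^2(X,\mu)$, and where

$H = \sum_{m \in \mathcal{L}} \prod_{\ell \mid m} \frac{|\Omega_\ell|}{|G_\ell^g| - |\Omega_\ell|}.$

Moreover we have

(6.4) $\Delta \le \max_{m \in \mathcal{L}} \max_{\pi \in \Pi_m^*} \sum_{n \in \mathcal{L}} \sum_{\tau \in \Pi_n^*} |W(\pi,\tau)|$

where

$W(\pi,\tau) = \frac{1}{\sqrt{|\hat{\Gamma}_m^\pi|\,|\hat{\Gamma}_n^\tau|}} \int_X \mathrm{Tr}\,\pi(\rho_m(F_x))\,\overline{\mathrm{Tr}\,\tau(\rho_n(F_x))}\, d\mu(x).$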

We now consider what happens to the equidistribution approach in this context. (Some of
this also applies to group conjugacy sieves, where $G^g = G$).

If we apply Lemma 2.11 to the elements $\varphi_\pi$, $\varphi_\tau$ of the bases $B_m$ and $B_n$ of $L^2(Y_m)$ and $L^2(Y_n)$, we see
that the function $[\varphi_\pi, \varphi_\tau]$ defined in (2.8) is the character of the representation

$[\pi, \bar{\tau}] = \pi_{m'} \boxtimes (\pi_d \otimes \overline{\tau_d}) \boxtimes \overline{\tau_{n'}}$

of $G_{[m,n]}$ already defined in (4.3). Hence we have

(6.5) $W(\pi,\tau) = \frac{1}{\sqrt{|\hat{\Gamma}_m^\pi|\,|\hat{\Gamma}_n^\tau|}} \int_X \mathrm{Tr}\bigl([\pi,\bar{\tau}]\,\rho_{[m,n]}(F_x)\bigr)\, d\mu(x).$

In applications, this means that to estimate the integrals $W(\pi,\tau)$ it suffices (and may be
more convenient) to be able to deal with integrals of the form

$\int_X \mathrm{Tr}\,\varpi(F_x)\, d\mu(x)$

where $\varpi$ is a representation of $G$ that factors through a finite product of groups $G_\ell$ (see Section 11
for an instance of this).

If we try to approach those integrals using the equidistribution method, then the analogue 
of ifT^ is the identity 

(6.6) |{p,(F,) = ytt}|= / d/i(x) = ^|X|+rrf(X;y«), 

defining rd{X;y^) for € Y^. Then (|2.1()j) becomes 

1^1 /I 
iy(7r,r) = — ' ni([7r,f]) +Of^_ ^ dim[7r, f]|r[^,„](X; y« 

where, comparing with Lemma 6.1 with $H = G^g_{[m,n]}$, we have

$$m([\pi, \bar\tau]) = \langle \varphi_\pi, \varphi_\tau \rangle \tag{6.7}$$

where the inner product is in $L^2(Y_{[m,n]})$ and both $\pi$ and $\tau$ are extended to (irreducible) representations
of $G_{[m,n]}$ by taking trivial components at those $\ell \mid [m,n]$ not dividing $m$ or $n$ respectively.
Hence by (6.3), we have $m([\pi, \bar\tau]) = 0$ unless $\pi$ and $\tau$ thus extended are isomorphic restricted to
$G^g_{[m,n]}$, which clearly can occur only if $m = n$ and then if $\pi = \tau$ by orthogonality of $\mathcal{B}_m^*$. When
$(m, \pi) = (n, \tau)$, the inner product is equal to $|\hat\Gamma_m^\pi|$ by (6.3).
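Using this and (2.10), we get

$$W(\pi, \tau) = \delta(\pi, \tau) |X| + O\Bigl( \sum_{y^\sharp \in Y^\sharp_{[m,n]}} \dim[\pi, \bar\tau] \, |r_{[m,n]}(X; y^\sharp)| \Bigr),$$

where the implied constant is $\le 1$. Hence for any sieve support $\mathcal{L}$, the large sieve bound of
Proposition 2.10 holds with

$$\Delta \le |X| + R(X; \mathcal{L}) \tag{6.8}$$

where

$$R(X; \mathcal{L}) = \max_{n \in \mathcal{L}} \max_{\tau \in \Pi_n^*} \Bigl\{ \sum_{m \in \mathcal{L}} \sum_{\pi \in \Pi_m^*} \sum_{y^\sharp \in Y^\sharp_{[m,n]}} \dim[\pi, \bar\tau] \, |r_{[m,n]}(X; y^\sharp)| \Bigr\}. \tag{6.9}$$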
For later reference, we also note the following fact:

Lemma 6.4. Let $m$, $n$ in $S(\Lambda)$, $\pi \in \Pi_m^*$, $\tau \in \Pi_n^*$. The multiplicity of the trivial representation
in the restriction of $[\pi, \bar\tau]$ to $G^g_{[m,n]}$ is equal to zero if $(m, \pi) \ne (n, \tau)$, and is equal to $|\hat\Gamma_m^\pi|$ if
$(m, \pi) = (n, \tau)$.

Proof. This multiplicity is by definition $\langle [\pi, \bar\tau], 1 \rangle$ computed in $L^2(G^g_{[m,n]})$, i.e., it is $\langle \varphi_\pi, \varphi_\tau \rangle$ in
$L^2(Y_{[m,n]})$ in the case $\alpha = 1 \in G/G^g$ (with the same convention on extending $\pi$ and $\tau$ to $G_{[m,n]}$
as before). So the result is a consequence of Lemma 6.1. □

7. Degrees and sums of degrees of representations of finite groups 

This section is essentially independent from the rest of the paper, and is devoted to proving 
some inequalities which are likely to be useful in estimating quantities such as ()6.4() or R{X, C) 
in 1)6. 9|1 . Indeed, we will use them later on in Section El and Section ITTl 

In practice, the bound for the individual exponential sums $W(\pi, \tau)$ is likely to involve the
order of the groups $G_m$ and the degrees of their representations, and their combination in (6.4) will
involve sums of the degrees. For instance, in the next sections, we will need to bound



ax|(dim7r)^|G[^^„]| ^ (dimr)|, 
n ren* 

max<(dim7r)7 7 (dimr) >. 

m,TT L ^ — ^ ^ — ^ J 



ren* 

In applications, the groups G^ are often (essentially) classical linear groups over F^, but they 
are not entirely known (it may only be known that they have bounded index in GL{n, F^) as i 
varies, for instance^ see |Kolj and Section [TUl . Our results are biased to this case. 

For a finite group $G$ and $p \in [1, +\infty]$, we denote

$$A_p(G) = \Bigl( \sum_\rho \dim(\rho)^p \Bigr)^{1/p} \text{ if } p \ne +\infty, \qquad A_\infty(G) = \max_\rho \{\dim(\rho)\},$$

where $\rho$ runs over irreducible linear representations of $G$ (in characteristic zero). For example,
we have $A_2(G) = \sqrt{|G|}$ for all $G$ and if $G$ is abelian, then $A_p(G) = |G|^{1/p}$ for all $p$. Moreover

$$\lim_{p \to +\infty} A_p(G) = A_\infty(G).$$
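
The definition can be illustrated numerically. The following is a minimal sketch, assuming the degrees of the irreducible representations are given as a plain list; the function name `a_p` and the two example degree lists (for the symmetric group $S_3$ and the cyclic group of order 6) are illustrative choices, not part of the text.

```python
import math

def a_p(degrees, p):
    """Return A_p(G), given the degrees of the irreducible representations of G."""
    if p == math.inf:
        return float(max(degrees))
    return sum(d ** p for d in degrees) ** (1.0 / p)

# Degrees of the irreducible representations of S_3 (order 6) and of Z/6Z.
S3_DEGREES = [1, 1, 2]
CYCLIC6_DEGREES = [1] * 6

if __name__ == "__main__":
    # A_2 is sqrt(|G|) in both cases; A_p decreases towards A_infinity as p grows.
    for p in (1, 2, 5, 50, math.inf):
        print(p, a_p(S3_DEGREES, p), a_p(CYCLIC6_DEGREES, p))
```

For $S_3$ this gives $A_1 = 4$, $A_2 = \sqrt{6}$ and $A_\infty = 2$, while for the abelian group every value is $6^{1/p}$.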

We are primarily interested in Ai{G) and Aoo(G), but A^/2{G) will also occur in the proof of 
Theorem 11.31 and other cases may turn out to be useful in other sieve settings. We start with 
an easy monotonicity lemma. 

Lemma 7.1. Let $G$ be a finite group and $H \subset G$ a subgroup, $p \in [1, +\infty]$. We have

$$A_p(H) \le A_p(G).$$

Proof. For any irreducible representation $\rho$ of $H$, choose (arbitrarily) an irreducible representation
$\pi(\rho)$ of $G$ that occurs with positive multiplicity in the induced representation $\operatorname{Ind}_H^G \rho$.
Let $\pi$ be a representation of $G$ in the image of $\rho \mapsto \pi(\rho)$. For any $\rho$ where $\pi(\rho) = \pi$, we have

$$\langle \rho, \operatorname{Res}_H^G \pi \rangle_H = \langle \operatorname{Ind}_H^G \rho, \pi \rangle_G > 0,$$

by Frobenius reciprocity, i.e., all $\rho$ with $\pi(\rho) = \pi$ occur in the restriction of $\pi$ to $H$. Hence for
$p \ne +\infty$ we obtain

$$\sum_{\pi(\rho) = \pi} \dim(\rho)^p \le \Bigl( \sum_{\pi(\rho) = \pi} \dim(\rho) \Bigr)^p \le \dim(\pi)^p,$$

and summing over all possible $\pi(\rho)$ gives the inequality

$$A_p(H)^p \le A_p(G)^p$$

by positivity. This settles the case $p \ne +\infty$, and the other case only requires noticing that
$\dim(\rho) \le \dim(\pi(\rho)) \le A_\infty(G)$. □

We come to the main result of this section. The terminology, which may not be familiar to
all readers, is explained by examples after the proof. We hope that there will be no confusion
between $p$ and the characteristic of the finite field $\mathbf{F}_q$ which occurs...

Proposition 7.2. Let $\mathbf{G}/\mathbf{F}_q$ be a split connected reductive linear algebraic group of dimension
$d$ and rank $r$ over a finite field, with connected center. Let $W$ be its Weyl group and $G = \mathbf{G}(\mathbf{F}_q)$
the finite group of rational points of $\mathbf{G}$.

(1) For any subgroup $H \subset G$ and $p \in [1, +\infty]$, we have

$$A_p(H) \le (q+1)^{(d-r)/2 + r/p} \Bigl( 1 + \frac{2r|W|}{q-1} \Bigr)^{1/p},$$

with the convention $r/p = 0$ if $p = +\infty$; in particular the second factor is $= 1$ for $p = +\infty$.

(2) If $\mathbf{G}$ is a product of groups of type $A$ or $C$, i.e., of linear and symplectic groups, then

$$A_p(H) \le (q+1)^{(d-r)/2 + r/p}.$$

The proof is based on a simple interpolation argument from the extreme cases $p = 1$, $p = +\infty$.
Indeed by Lemma 7.1 we can clearly assume $H = G$ and by writing the obvious inequality

$$A_p(G)^p = \sum_\rho \dim(\rho)^p \le A_\infty(G)^{p-1} A_1(G)$$

we see that it suffices to prove the following:

Proposition 7.3. Let $\mathbf{G}/\mathbf{F}_q$ be a split connected reductive linear algebraic group of dimension
$d$ with connected center, and let $G = \mathbf{G}(\mathbf{F}_q)$ be the finite group of its rational points. Let $r$ be
the rank of $\mathbf{G}$. Then we have

$$A_\infty(G) \le \frac{|G|_{p'}}{(q-1)^r} \le (q+1)^{(d-r)/2}, \quad \text{and} \quad A_1(G) \le (q+1)^{(d+r)/2} \Bigl( 1 + \frac{2r|W|}{q-1} \Bigr), \tag{7.1}$$

where $n_{p'}$ denotes the prime-to-$p$ part of a rational number $n$, $p$ being the characteristic of $\mathbf{F}_q$.
Moreover, if the principal series of $G$ is not empty^, there is equality

$$A_\infty(G) = \frac{|G|_{p'}}{(q-1)^r}$$

and $\dim \rho = A_\infty(G)$ if and only if $\rho$ is in the principal series.

Finally if $\mathbf{G}$ is a product of groups of type $A$ or $C$, then the factor $(1 + 2r|W|/(q-1))$ may
be removed in the bound for $A_1(G)$.
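
The interpolation step that reduces Proposition 7.2 to Proposition 7.3 can be written out explicitly. The following worked computation is a sketch under the assumptions $H = G$ and $p$ finite, using only the two bounds of (7.1) together with the inequality $A_p(G)^p \le A_\infty(G)^{p-1} A_1(G)$:

```latex
\begin{align*}
A_p(G)^p &= \sum_\rho \dim(\rho)^p \le A_\infty(G)^{p-1} A_1(G) \\
&\le (q+1)^{(p-1)(d-r)/2} \cdot (q+1)^{(d+r)/2} \Bigl( 1 + \frac{2r|W|}{q-1} \Bigr) \\
&= (q+1)^{p(d-r)/2 + r} \Bigl( 1 + \frac{2r|W|}{q-1} \Bigr).
\end{align*}
```

Taking $p$-th roots gives $A_p(G) \le (q+1)^{(d-r)/2 + r/p} (1 + 2r|W|/(q-1))^{1/p}$, and the same computation without the last factor gives the bound for groups of type $A$ or $C$.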

It seems very possible that the factor $(1 + 2r|W|/(q-1))$ could always be removed, but we
haven't been able to figure this out using Deligne-Lusztig characters, and in fact for groups of
type $A$ or $C$, we simply quote exact formulas for $A_1(G)$ due to Gow, Klyachko and Vinroot,
which are proved in completely different ways.^ The extra factor is not likely to be a problem
in many applications where $q \to +\infty$, but it may be questionable for uniformity with respect
to the rank. 

The ideas in the proof were suggested and explained by J. Michel. 

Proof. This is based on properties of the Deligne-Lusztig generalized characters. We will mostly
refer to [DM] and [Ca] for all facts which are needed (using notation from [DM], except for
writing simply $G$ for what is denoted $\mathbf{G}^F$ there). We identify irreducible representations of $G$
(up to isomorphism) with their characters seen as complex-valued functions on $G$.

First, for a connected reductive group $\mathbf{G}/\mathbf{F}_q$ over a finite field, Deligne and Lusztig have
constructed (see e.g. [DM, 11.14]) a family $R_T^G(\theta)$ of generalized representations of $G = \mathbf{G}(\mathbf{F}_q)$ (i.e.,
linear combinations with integer coefficients of "genuine" representations of $G$), parametrized
by pairs $(T, \theta)$ consisting of a maximal torus $\mathbf{T} \subset \mathbf{G}$ defined over $\mathbf{F}_q$ and a (one-dimensional)

^ In particular if q is large enough given G. 

''' The "right" upper bound for the case of groups of type A (i.e, GL{r)) may be recovered using the structure 
of unipotent representations of such groups. 




Ap{H) ^ (g + l)(^-'')/2+^'/P. 



MGf = 5^dim(p)f ^ A^{Gr-'A,{G) 



p
character $\theta$ of the finite abelian group $T = \mathbf{T}(\mathbf{F}_q)$. The $R_T^G(\theta)$ are not all irreducible, but
any irreducible character occurs (with positive or negative multiplicity) in the decomposition
of at least one such character. Moreover, $R_T^G(\theta)$ only depends (up to isomorphism) on the
$G$-conjugacy class of the pair $(T, \theta)$.

We quote here a useful classical fact: for any $T$ we have

$$(q-1)^r \le |T| \le (q+1)^r \tag{7.2}$$

(see e.g. [DM, 13.7 (ii)]), and moreover $|T| = (q-1)^r$ if and only if $\mathbf{T}$ is a split torus (i.e.,
$\mathbf{T} \simeq \mathbf{G}_m^r$ over $\mathbf{F}_q$). Indeed, we have

$$|T| = |\det(q - w \mid Y_0)|$$

where $w \in W$ is such that $\mathbf{T}$ is obtained from a split torus $\mathbf{T}_0$ by "twisting with $w$" (see e.g. [Ca,
Prop. 3.3.5]), and $Y_0 \simeq \mathbf{Z}^r$ is the group of cocharacters of $\mathbf{T}_0$. If $\lambda_1, \ldots, \lambda_r$ are the eigenvalues
of $w$ acting on $Y_0$, which are roots of unity, then we have

$$|T| = \prod_{i=1}^r (q - \lambda_i),$$

and so $|T| = (q-1)^r$ if and only if each $\lambda_i$ is equal to 1, if and only if $w$ acts trivially on $Y_0$, if
and only if $w = 1$ ($W$ acts faithfully on $Y_0$) and $\mathbf{T}$ is split.
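
The determinant formula can be checked by hand in the case of $GL(n)$. The sketch below assumes the standard description of that case, which is not spelled out in the text: the Weyl group is the symmetric group acting on $Y_0 = \mathbf{Z}^n$ by permutation matrices, so that each cycle of length $c$ of $w$ contributes the factor $q^c - 1$ (the product of $q - \zeta$ over the $c$-th roots of unity $\zeta$). The function names are illustrative.

```python
from itertools import permutations

def cycle_lengths(perm):
    """Cycle lengths of a permutation given as a tuple mapping i -> perm[i]."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = perm[j]
            length += 1
        lengths.append(length)
    return sorted(lengths)

def torus_order(perm, q):
    """det(q - w) for the permutation matrix of w: each c-cycle contributes q^c - 1."""
    order = 1
    for c in cycle_lengths(perm):
        order *= q ** c - 1
    return order

def check_bounds(n, q):
    """Check (q-1)^n <= |T_w| <= (q+1)^n for all w, with equality on the left only for w = 1."""
    identity = tuple(range(n))
    for w in permutations(range(n)):
        size = torus_order(w, q)
        assert (q - 1) ** n <= size <= (q + 1) ** n
        assert (size == (q - 1) ** n) == (w == identity)
    return True
```

For instance, with $n = 3$ and $q = 5$ the three cycle types give tori of order $4^3 = 64$, $(5^2 - 1) \cdot 4 = 96$ and $5^3 - 1 = 124$, all between $4^3$ and $6^3$.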

As in [DM, 12.12], we denote by $\rho \mapsto p(\rho)$ the orthogonal projection of the space $C(G)$
of complex-valued conjugacy-invariant functions on $G$ to the subspace generated by Deligne-Lusztig
characters, where $C(G)$ is given the standard inner product

$$\langle f, g \rangle = \frac{1}{|G|} \sum_{x \in G} f(x) \overline{g(x)},$$

and for a representation $\rho$, we of course denote $p(\rho) = p(\operatorname{Tr} \rho)$ the projection of its character.

For any representation $\rho$, we have $\dim(\rho) = \dim(p(\rho))$, where $\dim(f)$, for an arbitrary function
$f \in C(G)$, is obtained by linearity from the degree of characters. Indeed, for any $f$ standard
character theory shows that

$$\dim(f) = \langle f, \operatorname{reg}_G \rangle$$

where $\operatorname{reg}_G$ is the regular representation of $G$. From [DM, 12.14], the regular representation is
in the subspace spanned by the Deligne-Lusztig characters, so by definition of an orthogonal
projector we have

$$\dim(\rho) = \langle \rho, \operatorname{reg}_G \rangle = \langle p(\rho), \operatorname{reg}_G \rangle = \dim(p(\rho)).$$

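Now because the characters $R_T^G(\theta)$ for distinct conjugacy classes of $(T, \theta)$ are orthogonal (see
e.g. [DM, 11.15]), we can write

$$p(\rho) = \sum_{(T, \theta)} \frac{\langle \rho, R_T^G(\theta) \rangle}{\langle R_T^G(\theta), R_T^G(\theta) \rangle} R_T^G(\theta)$$

(sum over all distinct Deligne-Lusztig characters) and so

$$\dim(p(\rho)) = \sum_{(T, \theta)} \frac{\langle \rho, R_T^G(\theta) \rangle}{\langle R_T^G(\theta), R_T^G(\theta) \rangle} \dim(R_T^G(\theta)).$$

By [DM, 12.9] we have

$$\dim(R_T^G(\theta)) = \varepsilon_G \varepsilon_T |G|_{p'} |T|^{-1}, \tag{7.3}$$

where $\varepsilon_G = (-1)^r$ and $\varepsilon_T = (-1)^{r(T)}$, $r(T)$ being the $\mathbf{F}_q$-rank of $\mathbf{T}$ (see [DM, p. 66] for the
definition). This yields the formula

$$\dim(p(\rho)) = |G|_{p'} \sum_{(T, \theta)} \frac{1}{|T|} \frac{\langle \rho, \varepsilon_G \varepsilon_T R_T^G(\theta) \rangle}{\langle R_T^G(\theta), R_T^G(\theta) \rangle}. \tag{7.4}$$

Now we use the fact that pairs $(T, \theta)$ are partitioned in geometric conjugacy classes, defined
as follows: two pairs $(T, \theta)$ and $(T', \theta')$ are geometrically conjugate if and only if there exists
$g \in \mathbf{G}(\bar{\mathbf{F}}_q)$ such that $\mathbf{T} = g \mathbf{T}' g^{-1}$ and for all $n$ such that $g \in \mathbf{G}(\mathbf{F}_{q^n})$, we have

$$\theta(N_{\mathbf{F}_{q^n}/\mathbf{F}_q}(x)) = \theta'(N_{\mathbf{F}_{q^n}/\mathbf{F}_q}(g^{-1} x g)) \quad \text{for } x \in \mathbf{T}(\mathbf{F}_{q^n})$$

(see e.g. [DM, 13.2]). The point is the following property of geometric conjugacy classes: if the
generalized characters $R_T^G(\theta)$ and $R_{T'}^G(\theta')$ have a common irreducible component, then $(T, \theta)$
and $(T', \theta')$ are geometrically conjugate (see e.g. [DM, 13.2]).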

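In particular, for a given $\rho$, if $\langle \rho, R_T^G(\theta) \rangle$ is non-zero for some $(T, \theta)$, then only pairs $(T', \theta')$
geometrically conjugate to $(T, \theta)$ may satisfy $\langle \rho, R_{T'}^G(\theta') \rangle \ne 0$. So we have

$$\dim p(\rho) = |G|_{p'} \sum_{(T, \theta) \in \kappa} \frac{1}{|T|} \frac{\langle \rho, \varepsilon_G \varepsilon_T R_T^G(\theta) \rangle}{\langle R_T^G(\theta), R_T^G(\theta) \rangle}$$

for some geometric conjugacy class $\kappa$, depending on $\rho$. By Cauchy-Schwarz, we obtain

$$\dim p(\rho) \le |G|_{p'} \Bigl( \sum_{(T, \theta) \in \kappa} \frac{1}{|T|^2 \langle R_T^G(\theta), R_T^G(\theta) \rangle} \Bigr)^{1/2} \Bigl( \sum_{(T, \theta) \in \kappa} \frac{|\langle \rho, R_T^G(\theta) \rangle|^2}{\langle R_T^G(\theta), R_T^G(\theta) \rangle} \Bigr)^{1/2}. \tag{7.5}$$

The second term on the right is simply $\langle p(\rho), p(\rho) \rangle \le \langle \rho, \rho \rangle = 1$. As for the first term we
have

$$\sum_{(T, \theta) \in \kappa} \frac{1}{|T|^2 \langle R_T^G(\theta), R_T^G(\theta) \rangle} \le \frac{1}{(q-1)^{2r}} \sum_{(T, \theta) \in \kappa} \frac{1}{\langle R_T^G(\theta), R_T^G(\theta) \rangle}$$

by (7.2). Now it is known that for each class $\kappa$, the assumption that $\mathbf{G}$ has connected center
implies that the generalized character

$$\chi(\kappa) = \sum_{(T, \theta) \in \kappa} \frac{\varepsilon_G \varepsilon_T R_T^G(\theta)}{\langle R_T^G(\theta), R_T^G(\theta) \rangle}$$

is in fact an irreducible character of $G$ (such characters are called regular characters; see e.g. [Ca,
Prop. 8.4.7]). This implies that

$$\sum_{(T, \theta) \in \kappa} \frac{1}{\langle R_T^G(\theta), R_T^G(\theta) \rangle} = \langle \chi(\kappa), \chi(\kappa) \rangle = 1$$

and so we have

$$\dim p(\rho) \le \frac{|G|_{p'}}{(q-1)^r}. \tag{7.6}$$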

Now observe that we will have equality in this argument if $\rho$ is itself of the form $\pm R_T^G(\theta)$, and
if $|T| = (q-1)^r$. Those conditions hold for representations of the principal series, i.e., characters
$R_T^G(\theta)$ for an $\mathbf{F}_q$-split torus $\mathbf{T}$ and a character $\theta$ "in general position" (see e.g. [Ca, Cor. 7.3.5]).
Such characters are also, more elementarily, induced characters $\operatorname{Ind}_B^G(\theta)$, where $B = \mathbf{B}(\mathbf{F}_q)$ is a
Borel subgroup containing $T$, for some Borel subgroup $\mathbf{B}$ defined over $\mathbf{F}_q$ containing $\mathbf{T}$ (which
exist for a split torus $\mathbf{T}$) and $\theta$ is extended to $B$ by setting $\theta(u) = 1$ for unipotent elements
$u \in B$. For this, see e.g. [Lu, Prop. 2.6].

Conversely, let $\rho$ be such that

$$\dim \rho = \frac{|G|_{p'}}{(q-1)^r}$$

and let $\kappa$ be the associated geometric conjugacy class. From the above, for any $(T, \theta)$ in $\kappa$, we
have $|T| = (q-1)^r$, i.e., $\mathbf{T}$ is $\mathbf{F}_q$-split. Now it follows from Lemma 7.4 (probably well-known)
that this implies that $R_T^G(\theta)$ is an irreducible representation, so must be equal to $\rho$.

We now come to $A_1(G)$. To deal with the fact that in (7.4), $|T|$ depends on $(T, \theta) \in \kappa$, we
write

(since by (7.2), the dependency is rather weak).

Now summing over $\rho$, consider the first term's contribution. Since $\chi(\kappa)$ is an irreducible
character, the sum

$$\sum_\rho \sum_\kappa \langle \rho, \chi(\kappa) \rangle$$

is simply the number of geometric conjugacy classes. This is given by $q^{r'} |Z|$ by [DM, 14.42]
or [Ca, Th. 4.4.6 (ii)], where $r'$ is the semisimple rank of $\mathbf{G}$ and $Z = Z(\mathbf{G})(\mathbf{F}_q)$ is the group
of rational points of the center of $\mathbf{G}$. For this quantity, note that the center of $\mathbf{G}$ being
connected implies that $Z(\mathbf{G})$ is the radical of $\mathbf{G}$ (see e.g. [Sp, Pr. 7.3.1]) so $Z(\mathbf{G})$ is a torus and
$r = r' + \dim Z(\mathbf{G})$. So using again the bounds (7.2) for the cardinality of the group of rational
points of a torus, we obtain

$$|Z| q^{r'} \le (q+1)^r. \tag{7.8}$$

To estimate the sum of the contributions in the second term, say ^ i(p), we write 

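and we bound

$$\Bigl| \Bigl\langle \sum_\rho \rho, R_T^G(\theta) \Bigr\rangle \Bigr| \le \langle R_T^G(\theta), R_T^G(\theta) \rangle \tag{7.9}$$

for any $(T, \theta)$, since we can write

$$R_T^G(\theta) = \sum_\rho a(\rho) \rho \quad \text{with } a(\rho) \in \mathbf{Z},$$

and therefore

$$\Bigl| \Bigl\langle \sum_\rho \rho, R_T^G(\theta) \Bigr\rangle \Bigr| = \Bigl| \sum_\rho a(\rho) \Bigr| \le \sum_\rho |a(\rho)|^2 = \langle R_T^G(\theta), R_T^G(\theta) \rangle. \tag{7.10}$$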

Thus 

E*w<J^^i{(T.«))i. 

There are at most $|W|$ different choices of $T$ up to $G$-conjugacy, and for each there are at most
$|T| \le (q+1)^r$ different characters, and so we have

and 

(T.12) + 

To conclude, we use the classical formula

$$|G| = q^N \prod_{1 \le i \le r} (q^{d_i} - 1)$$

where $N$ is the number of positive roots of $\mathbf{G}$, and the $d_i$ are the degrees of invariants of the
Weyl group (this is because $\mathbf{G}$ is split; see e.g. [Ca, 2.4.1 (iv); 2.9, p. 75]). So

$$|G|_{p'} = \prod_{1 \le i \le r} (q^{d_i} - 1)$$

and

$$\frac{|G|_{p'}}{(q-1)^r} = \prod_{1 \le i \le r} \frac{q^{d_i} - 1}{q - 1} \le \prod_{1 \le i \le r} (q+1)^{d_i - 1} = (q+1)^{\sum (d_i - 1)} = (q+1)^{(d-r)/2} \tag{7.13}$$

since $\sum (d_i - 1) = N$ and $N = (d-r)/2$ (see e.g. [Ca, 2.4.1], [Sp, 8.1.3]).

Inserting this in (7.6) we derive the first inequality in (7.1), and with (7.12), we get

$$A_1(G) \le (q+1)^{(d+r)/2} \Bigl( 1 + \frac{2r|W|}{q-1} \Bigr),$$

which is the second part of (7.1).
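
The first inequality in (7.1) can be checked numerically for $GL(n)$. The sketch below assumes the standard facts that $d = n^2$, $r = n$ and that the degrees of invariants are $1, 2, \ldots, n$, so that the prime-to-$p$ part of the order is the product of $q^i - 1$; the function names are illustrative and the comparison is done with integers only.

```python
def gl_order_prime_to_p(n, q):
    """Prime-to-p part of |GL(n, F_q)|: the product of q^i - 1 for i = 1..n."""
    result = 1
    for i in range(1, n + 1):
        result *= q ** i - 1
    return result

def a_infinity_bound_holds(n, q):
    """Check |G|_{p'} / (q-1)^r <= (q+1)^{(d-r)/2} for G = GL(n, F_q)."""
    d, r = n * n, n
    lhs = gl_order_prime_to_p(n, q)
    # Multiply out the denominator so that everything stays an exact integer.
    return lhs <= (q - 1) ** r * (q + 1) ** ((d - r) // 2)
```

For $n = 2$ the two sides agree: $(q-1)(q^2-1)/(q-1)^2 = q + 1$, which is the degree of the principal series representations of $GL(2, \mathbf{F}_q)$.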

Now we explain why the extra factor involving the Weyl group can be removed for products
of groups of type $A$ and $C$. Clearly it suffices to work with $\mathbf{G} = GL(n)$ and $\mathbf{G} = CSp(2g)$.

For $\mathbf{G} = GL(n)$, with $d = n^2$ and $r = n$, Gow [Go] and Klyachko [Kl] have proved independently
that $A_1(G)$ is equal to the number of symmetric matrices in $G$. The bound

$$A_1(G) \le (q+1)^{(n^2+n)/2}$$

follows immediately.
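
The smallest case of this statement can be verified by brute force. The sketch below assumes $n = 2$ and a prime field $\mathbf{F}_\ell$, and uses the classical list of irreducible characters of $GL(2, \mathbf{F}_q)$ (which is not recalled in the text) to compute the sum of the degrees; the function names are illustrative.

```python
def count_invertible_symmetric_2x2(ell):
    """Count symmetric matrices [[a, b], [b, c]] over F_ell (ell prime) with nonzero determinant."""
    return sum(
        1
        for a in range(ell)
        for b in range(ell)
        for c in range(ell)
        if (a * c - b * b) % ell != 0
    )

def degree_sum_gl2(q):
    """A_1(GL(2, F_q)) from the classical list of irreducible characters of GL(2, F_q)."""
    return (
        (q - 1) * 1                          # one-dimensional characters
        + (q - 1) * q                        # twists of the Steinberg representation
        + (q - 1) * (q - 2) // 2 * (q + 1)   # principal series
        + q * (q - 1) // 2 * (q - 1)         # cuspidal characters
    )
```

Both quantities equal $q^2(q-1)$, which is indeed at most $(q+1)^{(d+r)/2} = (q+1)^3$.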

For $\mathbf{G} = CSp(2g)$, with $d = 2g^2 + g + 1$ and $r = g + 1$, the exact analog of Gow's theorem is
due to Vinroot [V]. Again, Vinroot's result implies $A_1(G) \le (q+1)^{(d+r)/2}$ in this case (see
[V, Cor 6.1], and use the formulas for the order of unitary and linear groups to check the final
bound). □

Here is the lemma used in the determination of $A_\infty(G)$ when there is a character in general
position of a split torus:

Lemma 7.4. Let $\mathbf{G}/\mathbf{F}_q$ be a split connected reductive linear algebraic group of dimension $d$
and let $G = \mathbf{G}(\mathbf{F}_q)$ be the finite group of its rational points. Let $\mathbf{T}$ be a split torus in $\mathbf{G}$, $\theta$ a
character of $T = \mathbf{T}(\mathbf{F}_q)$. If $\mathbf{T}'$ is also a split torus for any pair $(T', \theta')$ geometrically conjugate
to $(T, \theta)$, then $R_T^G(\theta)$ is irreducible.

Proof. If $R_T^G(\theta)$ is not irreducible, then by the inner product formula for Deligne-Lusztig characters,
there exists $w \in W$, $w \ne 1$, such that ${}^w\theta = \theta$ (see e.g. [DM, Cor. 11.15]). Let $\mathbf{T}'$ be a torus
obtained from $\mathbf{T}$ by "twisting by $w$", i.e., $\mathbf{T}' = g \mathbf{T} g^{-1}$ where $g \in \mathbf{G}$ is such that $g^{-1} \operatorname{Fr}(g) = w$
(see e.g. [Ca, 3.3]). Let $Y = \operatorname{Hom}(\mathbf{G}_m, \mathbf{T}) \simeq \mathbf{Z}^r$ (resp. $Y'$) be the abelian group of cocharacters
of $\mathbf{T}$ (resp. $\mathbf{T}'$); the conjugation isomorphism $\mathbf{T} \to \mathbf{T}'$ gives rise to a conjugation isomorphism
$Y \to Y'$ (loc. cit.). Moreover, there is an action of the Frobenius $\operatorname{Fr}$ on $Y$ and a canonical
isomorphism $T \simeq Y/(\operatorname{Fr} - 1)Y$ (see e.g. [DM, Prop. 13.7]), hence canonical isomorphisms of the
character groups $\hat{T}$ and $\hat{T}'$ as subgroups of the character groups of $Y$ and $Y'$:

$$\hat{T} \simeq \{\chi : Y \to \mathbf{C}^\times \mid (\operatorname{Fr} - 1)Y \subset \ker \chi\}, \qquad \hat{T}' \simeq \{\chi : Y' \to \mathbf{C}^\times \mid (\operatorname{Fr} - 1)Y' \subset \ker \chi\}.$$

Unraveling the definitions, a simple calculation shows that the condition ${}^w\theta = \theta$ is precisely
what is needed to prove that the character $\chi$ of $Y$ associated to $\theta$, when "transported" to a
character $\chi'$ of $Y'$ by the conjugation isomorphism, still satisfies $\ker \chi' \supset (\operatorname{Fr} - 1)Y'$ (see in
particular [Ca, Prop. 3.3.4]), so is associated with a character $\theta' \in \hat{T}'$.

Using the characterization of geometric conjugacy in [DM, Prop. 13.8], it is then clear that
$(T, \theta)$ is geometrically conjugate to $(T', \theta')$, and since $w \ne 1$, the torus $\mathbf{T}'$ is not split. So by
contraposition, the lemma is proved. □

Example 7.5. (1) Let $\ell$ be prime, $r \ge 1$ and let $\mathbf{G} = GL(r)/\mathbf{F}_\ell$. Then $G = GL(r, \mathbf{F}_\ell)$, $\mathbf{G}$ is a
split connected reductive group of rank $r$ and dimension $r^2$, with connected center of dimension 1. So
from Lemma 7.1 and Proposition 7.2 we get

$$A_p(H) \le (\ell + 1)^{r(r-1)/2 + r/p}$$

for $p \in [1, +\infty]$ for any subgroup $H$ of $G$, and in particular

$$A_\infty(H) \le (\ell + 1)^{r(r-1)/2} \quad \text{and} \quad A_1(H) \le (\ell + 1)^{r(r+1)/2}.$$

It would be interesting to know if there are other values of $p$ besides $p = 1$, 2 and $+\infty$ (the
latter when $q$ is large enough) for which $A_p(GL(n, \mathbf{F}_q))$ can be computed exactly.

(2) Let $\ell \ne 2$ be prime, $g \ge 1$ and let $\mathbf{G} = CSp(2g)/\mathbf{F}_\ell$. Then $G = CSp(2g, \mathbf{F}_\ell)$ and $\mathbf{G}$ is a
split connected reductive group of rank $g + 1$ and dimension $2g^2 + g + 1$, with connected center.
So from Lemma 7.1 and Proposition 7.2 we get

$$A_p(H) \le (\ell + 1)^{g^2 + (g+1)/p}$$

for $p \in [1, +\infty]$ for any subgroup $H$ of $G$, and in particular

$$A_\infty(H) \le (\ell + 1)^{g^2} \quad \text{and} \quad A_1(H) \le (\ell + 1)^{g^2 + g + 1}.$$

In the case of $G = SL(r, \mathbf{F}_q)$ or $G = Sp(2g, \mathbf{F}_q)$, which correspond to $\mathbf{G}$ where the center
is not connected, the bound for $A_\infty(G)$ given by this example is still sharp if we see $G$ as
subgroup of $GL(r, \mathbf{F}_q)$ or $CSp(2g, \mathbf{F}_q)$, because both $d$ and $r$ increase by 1, so $d - r$ doesn't
change. However, for $A_1(G)$, the exponent increases by one. Here is a slightly different argument
that almost recovers the "right" bound.

Lemma 7.6. Let $\mathbf{G} = SL(n)$ or $Sp(2g)$ over $\mathbf{F}_q$, $d$ the dimension and $r$ the rank of $\mathbf{G}$, and
$G = \mathbf{G}(\mathbf{F}_q)$. Then we have the following bounds

and 

MG) ^ (, + i)(^-o/2+./.(i±iy/^(i + 2.ir+ymy/^ 

for any $p \in [1, +\infty]$, where $\kappa = n$ for $SL(n)$ and $\kappa = 2$ for $Sp(2g)$.

The first bound is better for fixed $q$, whereas the second is almost as sharp as the bound for
$GL(n)$ or $CSp(2g)$ if $q$ is large.

Proof. As we observed before the statement, this holds for $p = +\infty$, so it suffices to consider
$p = 1$ and then use the same interpolation argument as for Proposition 7.2.

Let $\mathbf{G}_1 = GL(n)$ or $CSp(2g)$ for $\mathbf{G} = SL(n)$ or $Sp(2g)$ respectively, $G_1 = \mathbf{G}_1(\mathbf{F}_q)$. We use
the exact sequence 

(compare with Section ^ where m is either the determinant or the multiplicator of a symplectic 
similitude. Let $\rho$ be an irreducible representation of $G$, and as in the proof of Lemma 7.1, let
$\pi(\rho)$ be any irreducible representation of $G_1$ in the induced representation to $G_1$. The point is
that all "twists" $\pi(\rho) \otimes \psi$, where $\psi$ is a character of $\mathbf{F}_q^\times$ lifted to $G_1$ through $m$, are isomorphic
restricted to $G$, and hence each $\pi(\rho) \otimes \psi$ contains $\rho$ when restricted to $G$, and contains even
all $\rho$ with the same $\pi(\rho)$. So if $\pi \sim \pi'$, for representations of $G_1$, denotes isomorphism when
restricted to $G$, we have

where the sum is over a set of representatives for this equivalence relation. On the other hand,
$\dim \pi = \dim \pi'$ for $\pi \sim \pi'$, and for each $\pi$ there are $|\hat\Gamma/\hat\Gamma^\pi|$ distinct representations equivalent
to $\pi$, with notation as in Lemma 6.1. Hence,



AUG) ^ If'^ldim^. 
— 1 ^-^ 



From, e.g., [Ko1, Lemma 2.3], we know that $\hat\Gamma^\pi$ has order at most $n$ (for $SL(n)$) or 2 (for
$Sp(2g)$), which by applying Proposition 7.2 yields the first bound^, namely

(q + l)id+r)/2 

Ai{G) ^ K— '— , with K = 2 orn. 

To obtain the refined bound, observe that in the formula (7.7) for the dimension of an
irreducible representation $\rho$ of $G_1$, the first term is zero unless $\rho$ is a regular representation, and
the second $t(\rho)$ is smaller by a factor roughly $q$. If $\pi$ is regular, we have $\hat\Gamma^\pi = 1$ by Lemma 7.7
below. So it follows that

^i(G) ^ ^-yl ^ dimyr + K ^ dimvrj 

^ TT regular tt not regular 

^^^q^{q-l) + ^ E t{p) 
q -1 ^ , 

n not regular 

(in the first term, $q^r(q-1)$ is the number of geometric conjugacy classes for $G_1$, computed as
in (7.8), since $r$ is the semisimple rank of $\mathbf{G}_1$). We have the analogue of (7.11):



TT not regular 

by (|7.L-{j) (because 



IT not regular 

see (|7.1U|) . and the same argument leading to ()7.9() 1. The bound 

A,(G)^(g + l)('^+'-)/^fl+'"^^ + ^)'^' 

V q-1 

follows. □ 

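Lemma 7.7. Let $\mathbf{G} = GL(n)$ or $CSp(2g)$ over $\mathbf{F}_q$, $G = \mathbf{G}(\mathbf{F}_q)$. For any regular irreducible
representation $\rho$ of $G$, we have $\hat\Gamma^\rho = 1$.

Proof. As above, let $m : \mathbf{G} \to \mathbf{G}_m$ be the determinant or multiplicator character. Let $\rho$ be a
regular representation and $\psi$ a character of $\mathbf{F}_q^\times$ such that $\rho \otimes \psi \simeq \rho$, where $\psi$ is shorthand for
$\psi \circ m$. We wish to show that $\psi$ is trivial to conclude $\hat\Gamma^\rho = 1$. For this purpose, write

$$\rho = \sum_{(T, \theta) \in \kappa} \frac{\varepsilon_G \varepsilon_T R_T^G(\theta)}{\langle R_T^G(\theta), R_T^G(\theta) \rangle}$$

for some unique geometric conjugacy class $\kappa$. We have $R_T^G(\theta) \otimes \psi = R_T^G(\theta(\psi|T))$ (see, e.g., [DM,
Prop. 12.6]), so

$$\rho \otimes \psi = \sum_{(T, \theta) \in \kappa} \frac{\varepsilon_G \varepsilon_T R_T^G(\theta(\psi|T))}{\langle R_T^G(\theta), R_T^G(\theta) \rangle}.$$

Since the distinct Deligne-Lusztig characters are orthogonal, the assumption $\rho \simeq \rho \otimes \psi$ implies
that for any fixed $(T, \theta) \in \kappa$, the pair $(T, \theta(\psi|T))$ is also in the geometric conjugacy class $\kappa$.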
Consider then the translation of this condition using the bijection between geometric conjugacy 



This suffices for the applications in this paper.

classes of pairs $(T, \theta)$ and $\mathbf{F}_q$-rational conjugacy classes of semisimple elements in $\mathbf{G}^*$, the
dual group of $\mathbf{G}$ (see, e.g., [DM, Prop. 13.12]). Denote by $s$ the conjugacy class corresponding
to $(T, \theta)$. The pair $(T, \psi|T)$ corresponds to a central conjugacy class $s'$, because $\psi|T$ is the
restriction of a global character of $G$ (see the proof of [DM, Prop. 13.30]; alternately, use the fact
that both global characters and central conjugacy classes are characterized by being invariant
under the action of the Weyl group^), and the definition of the correspondence shows that
$(T, \theta\psi|T)$ corresponds to the conjugacy class $ss'$ (which is well-defined because $s'$ is central).
The assumption that $(T, \theta)$ and $(T, \theta\psi|T)$ are geometrically conjugate therefore means $ss' = s$,
i.e., $s' = 1$, and clearly this means $\psi = 1$, as desired. □

Remark 7.8. Here is a mnemonic device to remember the bounds for $A_\infty(G)$ in (7.1)^: among
the representations of $G$, we have the principal series $R(\theta)$, parametrized by the characters of a
maximal split torus, of which there are about $q^r$, and those share a common maximal dimension
$A$. Hence

$$q^r A^2 \asymp \sum_\theta \dim(R(\theta))^2 \le |G| \sim q^d$$

so $A$ is of order $q^{(d-r)/2}$. In other words, we expect that in the formula $\sum \dim(\rho)^2 = |G|$ the
principal series contributes a positive proportion.

The bound for $A_1(G)$ is also intuitive: there are roughly $q^r$ conjugacy classes, and as many
representations, and for a "positive proportion" of them, the degree of the representation is of
the maximal size given by $A_\infty(G)$.

8. Probabilistic sieves 

The introduction of a general measure space $(X, \mu)$ as component of the siftable set may
appear to be an instance of overenthusiastic French abstraction. However, we believe that the
generality involved may be useful and that it suggests new problems in a probabilistic setting.

To start with a simple example, let $\Psi = (\mathbf{Z}, \{\text{primes}\}, \mathbf{Z} \to \mathbf{Z}/\ell\mathbf{Z})$ be the classical sieve
setting. Consider now a probability space $(X, \Sigma, \mathbf{P})$ (i.e., $\mathbf{P}$ is a probability measure on $X$ with
respect to a $\sigma$-algebra $\Sigma$), and let $F = N : X \to \mathbf{Z}$ be an integer-valued random variable. Then
the triple $(X, \mathbf{P}, N)$ is a siftable set, and given any sieving sets $(\Omega_\ell)$ and prime sieve support
$\mathcal{L}^*$, it is tautological that the measure, or rather probability, of the associated sifted set in $X$ is
equal to

$$\mathbf{P}(N \in S(\mathbf{Z}, \Omega; \mathcal{L}^*)) = \mathbf{P}(\{\omega \in X \mid N(\omega) \ (\mathrm{mod}\ \ell) \notin \Omega_\ell, \text{ for all } \ell \in \mathcal{L}^*\}).$$

In other words, the sieve bounds in that context can give estimates for the probability that 
the values of some integer-valued random variable satisfy any condition that can be described 
by sieving sets. 

If we are given natural integer-valued random variables, this probabilistic setting gives a 
precise meaning to such notions as "the probability that an integer is squarefree". If the 
distribution law of N is uniform on an interval 1 ^ n ^ T, and we let T — > +oo, this is just the 
usual "natural density" . 

Example 8.1. Let $N_\lambda$ be a random variable with a Poisson distribution of parameter $\lambda$, i.e.,
we have

$$\mathbf{P}(N_\lambda = k) = e^{-\lambda} \frac{\lambda^k}{k!}, \quad \text{for } k \ge 0.$$

Then one can easily show, e.g., that the probability that $N_\lambda$ is squarefree (excluding 0) tends
to $6/\pi^2$ as $\lambda$ goes to $\infty$.
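
This limit can be observed numerically. The following is a minimal sketch, assuming that truncating the Poisson law to a window of twelve standard deviations around $\lambda$ loses a negligible amount of mass; the function names and the window width are illustrative choices.

```python
import math

def is_squarefree(k):
    """True if no square d*d with d >= 2 divides k (for k >= 1)."""
    d = 2
    while d * d <= k:
        if k % (d * d) == 0:
            return False
        d += 1
    return True

def poisson_pmf(k, lam):
    """P(N = k) for a Poisson variable of parameter lam, computed in log-space."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def prob_squarefree(lam, width=12.0):
    """Approximate P(N_lam is squarefree), keeping only values within `width` standard deviations of lam."""
    spread = width * math.sqrt(lam)
    lo = max(1, int(lam - spread))
    hi = int(lam + spread) + 1
    return sum(poisson_pmf(k, lam) for k in range(lo, hi + 1) if is_squarefree(k))

if __name__ == "__main__":
    for lam in (10.0, 100.0, 1000.0, 10000.0):
        print(lam, prob_squarefree(lam), 6 / math.pi ** 2)
```

The log-space evaluation of the probabilities avoids overflow in $\lambda^k$ and $k!$ for large parameters.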



Think of T in GL{n) being the diagonal matrices, with the Weyl group (S„ permuting the diagonal 
components. 

Which explains why it seemed to the author to be a reasonable statement to look for...

The following setting seems to have some interest as a way to get insight into properties of
"random" integers $n \in \mathbf{Z}$.

Consider a simple random walk $S_n$, $n \ge 0$, on $\mathbf{Z}$, i.e., a sequence of random variables $S_n$ on
$X$ such that $S_0 = 0$ and $S_{n+1} = S_n + X_{n+1}$ with $(X_n)_{n \ge 1}$ a sequence of independent random
variables with Bernoulli distribution $\mathbf{P}(X_n = \pm 1) = \frac{1}{2}$ (or one could take general Bernoulli
distributions $\mathbf{P}(X_n = 1) = p$, $\mathbf{P}(X_n = -1) = q$, for some $p, q \in ]0, 1[$ with $p + q = 1$). These
variables $(S_n)$ give a natural sequence of siftable sets $(X, \mathbf{P}, S_n)$. It turns out to be quite easy
to estimate the corresponding sieve constants; here the dependency on the random variable
component of the siftable set is the most important, so we denote $\Delta(S_n, \mathcal{L})$ the sieve constant.

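Proposition 8.2. Let $(S_n)$ be a simple random walk on $\mathbf{Z}$. With notation as above, we have

$$\Delta(S_n, \mathcal{L}) \le 1 + \Bigl| \cos\Bigl( \frac{2\pi}{L^2} \Bigr) \Bigr|^n \sum_{m \in \mathcal{L}} m$$

for $n \ge 1$ and for any sieve support $\mathcal{L}$ consisting entirely of odd squarefree integers $m \le L$.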

It is natural to exclude even integers, simply because $S_n \ (\mathrm{mod}\ 2)$ is not equidistributed: more
precisely, $\mathbf{P}(S_n \text{ is even})$ is 1 or 0 depending on whether $n$ itself is even or odd. In
probabilistic terms, the random walk is not aperiodic. The simplest way to avoid this problem
would be to assume that the increments $X_n$ have distribution

$$\mathbf{P}(X_n = \pm 1) = \mathbf{P}(X_n = 0) = \frac{1}{3}$$

(i.e., at each step the walker may decide to remain still). The reader will have no trouble
adapting the arguments below to this case, without parity restrictions.

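Proof. We will estimate the "exponential sums", which in the current context, using probabilistic
notation $\mathbf{E}(Y) = \int_X Y \, d\mathbf{P}$ for the integral, are simply

$$W(a, b) = \mathbf{E}\Bigl( e\Bigl( \frac{a_1 S_n}{m_1} \Bigr) e\Bigl( -\frac{a_2 S_n}{m_2} \Bigr) \Bigr)$$

for $m_1, m_2 \in \mathcal{L}$, $a_i \in (\mathbf{Z}/m_i\mathbf{Z})^\times$. Using the expression $S_n = X_1 + \cdots + X_n$ for $n \ge 1$,
independence, and the distribution of the $X_i$, we obtain straightforwardly

$$W(a, b) = \mathbf{E}\Bigl( e\Bigl( \frac{(a_1 m_2 - a_2 m_1) X_1}{m_1 m_2} \Bigr) \Bigr)^n = \Bigl( \cos 2\pi \frac{a_1 m_2 - a_2 m_1}{m_1 m_2} \Bigr)^n.$$

The condition that $m_i$ are odd, and that $(a_i, m_i) = 1$, imply that $|W(a, b)| = 1$ if and only if
$a_1 = a_2$ and $m_1 = m_2$, and otherwise

$$|W(a, b)| \le \Bigl| \cos \frac{2\pi}{m_1 m_2} \Bigr|^n.$$

Hence the sieve constant is bounded by

$$\Delta(S_n, \mathcal{L}) \le \max_{m_1, a_1} \Bigl\{ 1 + \sum_{m_2} \sum^{*}_{a_2 \ (\mathrm{mod}\ m_2)} \Bigl| \cos \frac{2\pi}{m_1 m_2} \Bigr|^n \Bigr\} \le 1 + \Bigl| \cos \frac{2\pi}{L^2} \Bigr|^n \sum_{m \in \mathcal{L}} m.$$

□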
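
The closed form of the exponential sums can be confirmed by an exact computation. The sketch below assumes the symmetric walk with steps $\pm 1$ of probability $1/2$ each and the usual convention $e(z) = \exp(2i\pi z)$; it computes $\mathbf{E}(e(k S_n / M))$ from the binomial law of $S_n$ and compares it with $\cos(2\pi k/M)^n$, where $k$ plays the role of $a_1 m_2 - a_2 m_1$ and $M$ that of $m_1 m_2$. The function names are illustrative.

```python
import cmath
import math

def walk_exponential_sum(n, k, modulus):
    """E[e(k * S_n / modulus)] for the simple random walk, via the exact law of S_n."""
    total = 0j
    for ups in range(n + 1):              # number of +1 steps among the n steps
        s = 2 * ups - n                   # corresponding value of S_n
        prob = math.comb(n, ups) / 2 ** n
        total += prob * cmath.exp(2j * math.pi * k * s / modulus)
    return total

def cosine_power(n, k, modulus):
    """Closed form cos(2*pi*k/modulus)^n obtained from independence of the steps."""
    return math.cos(2 * math.pi * k / modulus) ** n

if __name__ == "__main__":
    for n, k, modulus in ((10, 1, 15), (10, 7, 15), (25, 2, 9)):
        print(walk_exponential_sum(n, k, modulus).real, cosine_power(n, k, modulus))
```

Taking `modulus = 2` and `k = 1` gives a sum of absolute value 1 for every `n`, which is the lack of equidistribution modulo 2 that motivates the restriction to odd moduli.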



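Corollary 8.3. With notation as above, we have:

(1) For any sieving sets $\Omega_\ell \subset \mathbf{Z}/\ell\mathbf{Z}$ for $\ell$ odd, $\ell \le L$, and $L \ge 3$, we have

$$\mathbf{P}(S_n \in S(\mathbf{Z}, \Omega; L)) \le \Bigl( 1 + L^2 \exp\Bigl( -\frac{n\pi^2}{L^4} \Bigr) \Bigr) H^{-1}$$

where

$$H = \sum^{\flat}_{\substack{m \le L \\ m \text{ odd}}} \prod_{\ell \mid m} \frac{|\Omega_\ell|}{\ell - |\Omega_\ell|}.$$

(2) Let $\varepsilon > 0$ be given, $\varepsilon \le 1/4$. For any odd $q \ge 1$, any $a$ coprime with $q$, we have

$$\mathbf{P}(S_n \text{ is prime and } \equiv a \ (\mathrm{mod}\ q)) \ll \frac{1}{\varphi(q)} \frac{1}{\log n}$$

if $n \ge 2$, $q \le n^{1/4 - \varepsilon}$, the implied constant depending only on $\varepsilon$.

Note that (2) is Theorem 1.1 in the introduction.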

Proof. For (1), we take $\mathcal{L}$ to be the set of odd squarefree numbers $\le L$ (so $\mathcal{L}^*$ is the set of odd
primes $\le L$), and then since $\cos(x) \le 1 - x^2/4$ for $0 \le x \le 2\pi/9$, the proposition gives

$$\Delta \le 1 + L^2 \Bigl( 1 - \frac{\pi^2}{L^4} \Bigr)^n \le 1 + L^2 \exp\Bigl( -\frac{n\pi^2}{L^4} \Bigr)$$

and the result is a mere restatement of the large sieve inequality.
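
The two elementary inequalities behind this estimate are easy to test numerically. The sketch below assumes that the cosine bound in question is $\cos(x) \le 1 - x^2/4$ on $[0, 2\pi/9]$, applied with $x = 2\pi/L^2$ for $L \ge 3$; the function names and the grid size are illustrative.

```python
import math

def cosine_bound_holds(steps=10_000):
    """Check cos(x) <= 1 - x^2/4 on a grid of [0, 2*pi/9], up to rounding error."""
    upper = 2 * math.pi / 9
    return all(
        math.cos(x) <= 1 - x * x / 4 + 1e-15
        for x in (upper * i / steps for i in range(steps + 1))
    )

def sieve_constant_bounds(n, L):
    """Return the successive bounds 1 + L^2 (1 - pi^2/L^4)^n and 1 + L^2 exp(-n pi^2/L^4)."""
    x = math.pi ** 2 / L ** 4
    return 1 + L ** 2 * (1 - x) ** n, 1 + L ** 2 * math.exp(-n * x)
```

For example, `sieve_constant_bounds(10**6, 10)` returns two values indistinguishable from 1, in line with the fact that the bound is useful when $L$ is somewhat smaller than $n^{1/4}$.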

For (2), we have to change the sieve setting a little bit. Consider the sieve setting above
except that for primes $\ell \mid q$ we take $\rho_\ell$ to be reduction modulo $\ell^{\nu(\ell)}$ where $\nu(\ell)$ is the
$\ell$-valuation of $q$. Take the siftable set $(X, \mathbf{P}, S_n)$, and the sieve support

$$\mathcal{L} = \{mm' \mid mm' \text{ squarefree}, (m, 2q) = 1, m \le L/q \text{ and } m' \mid q\},$$

with $\mathcal{L}^*$ still the set of odd primes $\le L$.

Proceeding as in the proof of Proposition |S21 the sieve constant is bounded straightforwardly 

by 



A < 1 + 



'^^^Z^l 2^ 2^mq^l+T{q)q Lexp(^ — jj- 

m^L/g m'\q 
(m,2g)=l 

where $\tau(q)$ is the number of divisors of $q$.
Finally, take 

^ \z/rWZ-{a} if£|g. 
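If $S_n$ is a prime number congruent to $a$ mod $q$, then we have $S_n \in S(\mathbf{Z}, \Omega; \mathcal{L}^*)$, hence

$$\mathbf{P}(S_n \text{ is a prime} \equiv a \ (\mathrm{mod}\ q)) \le \mathbf{P}(S_n \in S(\mathbf{Z}, \Omega; \mathcal{L}^*)) \le \Delta H^{-1}$$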

where 

m^L/q m'\g £"11^ 
{m,2q)=l 

Now the desired estimate follows on taking q ^ n^/'*^^ and L = gn*^, using the classical lower 
bound (see e.g. |B], |IK1 (6.82)]) 

Y! 1 >^Ml„gL/,»^Ml„g„ 
^-^ ip[m) 2q q 

m^L/q 
{m,2q)=l 

(the implied constant depending only on e) together with the cute identity 

ip*{m') = q 



E 

m'\q 



which is trivially verified by multiplicativity. □ 

Remark 8.4. (1) It is important to keep in mind that, by the Central Limit Theorem, $|S_n|$ is
usually of order of magnitude $\sqrt{n}$ (precisely, $S_n/\sqrt{n}$ converges weakly to the normal distribution
with variance 1 as $n \to +\infty$). So the estimate $\Delta \le 1 + L^2 \exp(-n\pi^2/L^4)$, which gives a non-trivial
result in applications as long as, roughly speaking, $L \le n^{1/4}/(\log n)^{1/4}$, compares well
with the classical large sieve for integers $n \le N$, where $\Delta \le N - 1 + L^2$, which is non-trivial
for $L \le \sqrt{N}$.

(2) The second part is an analogue of the Brun-Titchmarsh inequality, namely (in its original
form)

$$\pi(x; q, a) \ll \frac{1}{\varphi(q)} \frac{x}{\log x}$$

for $x \ge 2$, $(a, q) = 1$ and $q \le x^{1 - \varepsilon}$, the implied constant depending only on $\varepsilon > 0$. However,
from the previous remark we see that it is weaker than could be expected, namely $q \le n^{1/4 - \varepsilon}$
would have to be replaced by $q \le n^{1/2 - \varepsilon}$. Here we have exploited the flexibility of the sieve
setting and sieve support. For a different use of this flexibility, see Section [TTl we want to point 
out here that the possibility of using a careful non-obvious choice of C was first exploited by 
Zywina in his preprint [Z]. 

It would be quite interesting to know if the extension to $q \le n^{1/2 - \varepsilon}$ holds. The point is that
if we try to adapt the classical method, which is to sieve for those $k$, $1 \le k \le x/q$, such that
$qk + a$ is prime, we are led to some interesting and non-obvious (for the author) probabilistic
issues; indeed, if $S_n \equiv a \ (\mathrm{mod}\ q)$, the (random) integer $k$ such that $S_n = kq + a$ can be described
as follows: we have $k = T_N$ where $N$ is a random variable

$$N = |\{m \le n \mid S_m \equiv a \ (\mathrm{mod}\ q)\}|$$

and (Tj) is a random walk with initial distribution given by 

p(ro = 0) = 1 - -, p(ro = -1) = -, 

q q 
and independent identically distributed increments $V_i = T_i - T_{i-1}$ such that

$$\mathbf{P}(V_i = 0) = 1 - \frac{1}{q}, \qquad \mathbf{P}(V_i = \pm 1) = \frac{1}{2q}.$$

So what is needed is to perform sieve on the siftable set $(\{S_n \equiv a \ (\mathrm{mod}\ q)\}, \mathbf{P}, T_N)$. Since the
length of the auxiliary walk is random, this requires some care, and we hope to come back to
this. Note at least that if we look at the same problem with $(\Omega, \mathbf{P}, T_i)$ for a fixed $i$, then we easily
get by sieving

P(qTi + a is prime) <^ —rr-, r 

if[q) logt 

for all q ^ i^/^^^, e > 0, the implied constant depending only on e. 

(3) Obviously, it would be very interesting to derive lower bounds or asymptotic formulas for
$\mathbf{P}(S_n \text{ is prime})$ for instance, and for other analogues of classical problems of analytic number
theory. Note that it is tempting to attack the problems with "local" versions of the Central 
Limit Theorem and summation by parts to reduce to the purely arithmetic deterministic case. 
Problems where such a reduction is not feasible would of course be more interesting. 

In the next section, we will give another example of probabilistic sieve, similar in spirit to 
the above, although the basic setting will be rather deeper, and the results are not accessible 
to a simple summation by parts. 

Finally, we remark that this probabilistic point of view should not be mistaken with "prob- 
abilistic models" of integers (or primes), such as Cramer's model: the values of the random 
variables we have discussed are perfectly genuine integers. 

9. Sieving in arithmetic groups 

We now start discussing examples of sieve settings which seem to be either new, or have only 
been approached very recently. The first example concerns sieving for elements in an arithmetic 
group G. There are actually a number of different types of siftable sets that one may consider 
here. 



To give a caricatural example, if it were possible to show that, for some sequence of random variables $N_n$
distributed on disjoint subsets of integers, the probability $P(N_n \text{ and } N_n + 2 \text{ are both primes})$ is always strictly
positive, then the twin-prime conjecture would follow.
Maybe the most obvious idea for analytic number theorists is to take the group sieve setting
defined by

$$\Psi = (SL(n, \mathbf{Z}), \{\text{primes}\}, G \to SL(n, \mathbf{F}_\ell))$$

(where $G = SL(n, \mathbf{Z})$ and the last reduction map is known to be surjective for all $\ell$), and look at the siftable set
$X \subset G$ which is the set of those matrices with norm bounded by some quantity $T$, with $F_x = x$
and counting measure. In other words, instead of sieving integers, we want to sieve integral
unimodular matrices. Of course, $SL(n, \mathbf{Z})$ may be replaced with other arithmetic groups, even
possibly with infinite-index subgroups.

Here the equidistribution approach leads to hyperbolic lattice point problems (in the case $n = 2$),
and generalizations of those for $n \ge 3$. The issue of uniformity with respect to $q$ when taking
"congruence towers" $\Gamma \cap \Gamma(q)$, where $\Gamma(q)$ is the principal congruence subgroup, is the main
issue, compared with the results available in the literature (e.g., the work of Duke, Rudnick and
Sarnak [DRS] gives individual equidistribution, with methods that may be amenable to uniform
treatments, whereas more recent ergodic-theoretic methods by Eskin, Mozes, McMullen, Shah
and others, see e.g. [EMS], seem to be more problematic in this respect).

A tentative and very natural application is the fact that "almost all" unimodular
matrices with norm $\le T$ have irreducible characteristic polynomial, with an estimate for the
number of exceptional matrices (this question was also recently formulated by Rivin [R, Conj.
8], where it is observed that the qualitative form of this statement is likely to follow from the
results of Duke, Rudnick and Sarnak). The case of integral matrices with arbitrary determinant
can be treated very quickly as a simple consequence of the higher-dimensional large sieve as
in [G] (in other words, embed invertible matrices in the additive group $G = M(n, \mathbf{Z}) \simeq \mathbf{Z}^{n^2}$,
and use abelian harmonic analysis).

The setting of arithmetic groups suggests other types of siftable sets, which are of a more
combinatorial flavor, and the "probabilistic" theme of Section 8 is also a natural fit. Theorem 1.2
gives some first results of this kind.

Let $G$ be a finitely generated group. Assuming a symmetric set of generators $S$ to be fixed
(i.e., with $S^{-1} = S$), three siftable sets $(X, \mu, F)$ of great interest arise naturally:

- the set $X$ of elements $g \in G$ with word-length metric $\ell_S(g)$ at most $N$, for some integer $N \ge 1$,
i.e., the set of those elements $g \in G$ that can be written as

$$g = s_1 \cdots s_k$$

with $k \le N$, $s_i \in S$ for $1 \le i \le k$. Here we take $F_x = x$ for $x \in X$, and of course $\mu$ is the
counting measure.

- the set $W$ of words of length $N$ in the alphabet $S$, for some integer $N \ge 1$, with $F_w$ the
"value" in $G$ of the word $w \in W$, i.e., the image of $w$ by the natural (surjective) homomorphism
$F(S) \to G$ from the free group generated by $S$ to $G$. Again $\mu$ is the counting measure.

- as in Section 8, we may consider a probabilistic siftable set $(\Omega, P, X_N)$, where $\Omega$ is some
probability space, $P$ the associated probability measure, and $X_N = \xi_1 \cdots \xi_N$, where $(\xi_k)$ is an
$N$-tuple of $S$-valued random variables. The simplest case is when $(\xi_k)$ is an independent vector,
and the distribution of each $\xi_k$ is uniform: $P(\xi_k = s) = 1/|S|$. In other words, $X_N$ is then the
$N$-th step in the simple left-invariant random walk on $G$ given by $S$. If $G = \mathbf{Z}$ and $S = \{\pm 1\}$,
we considered this in Section 8.
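The third kind of siftable set can be simulated directly. The sketch below is an illustration only, assuming $G = SL(3, \mathbf{Z})$ with the elementary matrices $E_{i,j}(\pm 1)$ as generating set; all function names are hypothetical. It draws random products $X_N$ and tests whether the characteristic polynomial is reducible, using the fact that a monic integral cubic with constant term $\pm 1$ is reducible over $\mathbf{Q}$ exactly when $1$ or $-1$ is a root.

```python
import random


def elementary_generators(n):
    """The 2(n^2 - n) elementary matrices E_ij(+-1), as tuples of row tuples."""
    gens = []
    for i in range(n):
        for j in range(n):
            if i != j:
                for sign in (1, -1):
                    m = [[int(r == c) for c in range(n)] for r in range(n)]
                    m[i][j] = sign
                    gens.append(tuple(map(tuple, m)))
    return gens


def matmul(a, b):
    """Product of two square integer matrices given as tuples of rows."""
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )


def random_product(gens, length, rng):
    """X_N = xi_1 ... xi_N with independent uniform factors xi_k in S."""
    n = len(gens[0])
    x = tuple(tuple(int(r == c) for c in range(n)) for r in range(n))
    for _ in range(length):
        x = matmul(x, rng.choice(gens))
    return x


def charpoly_reducible_sl3(g):
    """For g in SL(3, Z): det(T - g) = T^3 - tr(g) T^2 + c2 T - 1.

    A cubic is reducible over Q iff it has a rational root, and by the
    rational root theorem such a root must be +1 or -1 here.
    """
    tr = g[0][0] + g[1][1] + g[2][2]
    # c2 = sum of the three principal 2x2 minors
    c2 = (g[0][0] * g[1][1] - g[0][1] * g[1][0]
          + g[0][0] * g[2][2] - g[0][2] * g[2][0]
          + g[1][1] * g[2][2] - g[1][2] * g[2][1])

    def p(t):
        return t ** 3 - tr * t ** 2 + c2 * t - 1

    return p(1) == 0 or p(-1) == 0


if __name__ == "__main__":
    rng = random.Random(0)
    gens = elementary_generators(3)
    sample = [random_product(gens, 30, rng) for _ in range(200)]
    print(sum(map(charpoly_reducible_sl3, sample)) / len(sample))
```

The printed value is an empirical estimate of $P(\det(X_N - T) \text{ is reducible})$ for $N = 30$; it is a Monte Carlo illustration, not a substitute for the sieve bound.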

Remark 9.1. Note that the last two examples are in fact equivalent: we have

$$P(X_N \in A) = \frac{1}{|W|}\,|\{w \in W \mid F_w \in A\}|$$

for any subset $A \subset G$. (Since $|W| = |S|^N$, this explains why the two statements of Theorem 1.2
are equivalent). Although this reduces one particular probabilistic case to a "counting" sieve,
we may indeed wish to vary the distribution of the factors of the random walk, and doing so
would not in general lead to such a reduction; even when possible, this may not be desirable,
because it would involve rather artificial constructs. For instance, another natural type of
random walk is the random walk given by factors $\xi_k$ where

$$P(\xi_k = 1) = P(\xi_k = s) = \frac{1}{|S| + 1} \quad \text{for all } s \in S.$$

This is also equivalent to replacing $S$ by $S \cup \{1\}$ (if $1 \notin S$ at least), but the set of words $w$
where each component $w_i$ may be the identity is not very natural.

Footnote: A useful survey on combinatorial and geometric group theory is given in the book of de la Harpe [Ha], and
a survey of random walks on groups is that of Saloff-Coste [SC].
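The equivalence between the counting and probabilistic points of view can be made concrete by brute force. The following sketch is only an illustration, assuming $G = SL(2, \mathbf{Z})$ with the four elementary generators (a choice made here, with hypothetical function names): it enumerates all words of length $N$ and counts those whose value has reducible characteristic polynomial, so that the probability is the count divided by $|S|^N$.

```python
from fractions import Fraction
from itertools import product
from math import isqrt

# Elementary generators of SL(2, Z): upper and lower unipotent matrices with +-1.
S = [((1, 1), (0, 1)), ((1, -1), (0, 1)), ((1, 0), (1, 1)), ((1, 0), (-1, 1))]


def mul2(a, b):
    """Product of two 2x2 integer matrices."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def value(word):
    """F_w: the product in G of the letters of the word w."""
    g = ((1, 0), (0, 1))
    for s in word:
        g = mul2(g, s)
    return g


def reducible(g):
    """T^2 - tr(g) T + 1 splits over Q iff tr(g)^2 - 4 is a perfect square."""
    disc = (g[0][0] + g[1][1]) ** 2 - 4
    return disc >= 0 and isqrt(disc) ** 2 == disc


def count_reducible_words(n):
    """|{w in W : det(F_w - T) is reducible}| for words of length n."""
    return sum(reducible(value(w)) for w in product(S, repeat=n))


def prob_reducible(n):
    """P(det(X_n - T) is reducible) for the simple random walk, as an exact fraction."""
    return Fraction(count_reducible_words(n), len(S) ** n)
```

For $N = 2$ the only irreducible cases are the eight mixed products of an upper and a lower unipotent generator (traces $3$ or $1$), so half of the sixteen words are reducible.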

We now provide a concrete example by proving Theorems 1.2 and 1.3, indeed in a slightly
more general case. Let $\mathbf{G}$ be either $SL(n)$ or $Sp(2g)$ for some $n \ge 3$ or $g \ge 2$, let $G = \mathbf{G}(\mathbf{Z})$,
and let $S$ be a symmetric set of generators for $G$ (for instance, the elementary matrices with $\pm 1$
off the diagonal are generators of $SL(n, \mathbf{Z})$, see Remark 9.9). We consider either the group
sieve setting

$$\Psi = (G, \{\text{primes}\}, G \to G_\ell)$$

where $G_\ell = \mathbf{G}(\mathbf{Z}/\ell\mathbf{Z})$ is $SL(n, \mathbf{F}_\ell)$ or $Sp(2g, \mathbf{F}_\ell)$, and the maps are simply reduction modulo $\ell$,
or the induced conjugacy sieve setting. It is well-known that the reduction maps are onto for
all $\ell$ (see e.g. [Shi, Lemma 1.38] for the case of $SL(n)$).

We will look here at the second type of siftable set $\Upsilon = (W, \mu, F)$, i.e., $W$ is the set of words
of length $N$ in $S$, and $F_w$ is the "value" of a word $w$ in $G$. Equivalently, we consider the
simple left-invariant random walk $(X_N)$. In that case, the qualitative form of Theorem 1.2 was
proved by Rivin [R], and the latest version of Rivin's preprint also discusses quantitative forms
of equidistribution in $G_\ell$, using Property (T) in a manner analogous to what we do.

We will obtain a bound for the large sieve constant by appealing to 1)6. 4() and its analogue for 
the group sieve setting, estimating the exponential sums W{tt, t) or W{ipT^^ej, fryj') of (|4.2j) ."^^ 
The crucial ingredient is the so-called "Property $(\tau)$".

Proposition 9.2. Let $\Gamma$ be a finitely generated group, $I$ an arbitrary index set and $\rho_i : \Gamma \to G_i$
for $i \in I$ a family of surjective homomorphisms onto finite groups, such that $\Gamma$ has Property $(\tau)$
with respect to the family $(\ker \rho_i)$ of finite index subgroups of $\Gamma$.

Let $S = S^{-1}$ be a symmetric finite generating set of $\Gamma$, and for $N \ge 1$, let $W = W_N$ denote
the set of words of length $N$ in the alphabet $S$, and let $F_w$ denote the value of the word $w$ in $\Gamma$.
Assume that there exists a word $r$ in the alphabet $S$ of odd length $c$ such that $F_r = 1 \in \Gamma$.

Then there exists $\alpha > 0$ such that for any $i \in I$, any representation $\pi : \Gamma \to GL(V)$ that
factors through $G_i$ and does not contain the trivial representation, any vectors $e$, $f$ in the space
of $\pi$, we have

(9.1) $$\Bigl|\sum_{w \in W} \langle \pi(F_w)e, f\rangle\Bigr| \le \|e\|\,\|f\|\,|W|^{1-\alpha},$$

for $N \ge 1$, where $\langle\cdot,\cdot\rangle$ is a $\Gamma$-invariant inner product on $V$, and hence

(9.2) $$\Bigl|\sum_{w \in W} \mathrm{Tr}\,\pi(F_w)\Bigr| \le (\dim \pi)\,|W|^{1-\alpha}.$$

The constant $\alpha$ depends only on $\Gamma$, $|S|$, the $(\tau)$-constant for $(S, \Gamma, \ker \rho_i)$ and the length $c$ of
the relation $r$.

We will recall briefly the definition of Property $(\tau)$ and the associated $(\tau)$-constant in the
course of the proof; see e.g. [Lu, §4.3] or [LZ] for more complete surveys. This should also be
compared with [SC, Th. 6.15].
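The mechanism behind this kind of estimate is easiest to see in an abelian toy case. The sketch below is an illustration under assumptions that are not in the text: it takes the quotient $\mathbf{Z}/m\mathbf{Z}$ for one fixed $m$, the one-dimensional representations $\chi_f(x) = \exp(2\pi i f x/m)$, and the generating set $\{-1, 0, 1\}$ (which contains the identity, so a relation of odd length $1$ exists). It does not illustrate uniformity in $m$, since the spectral gap of this toy family shrinks as $m$ grows; all names are hypothetical.

```python
import cmath
from itertools import product


def word_sum(m, freq, gens, n):
    """Brute-force sum over all words w of length n of chi(F_w),
    where chi(x) = exp(2 pi i freq x / m) and F_w is the sum of the letters."""
    return sum(
        cmath.exp(2j * cmath.pi * freq * sum(w) / m) for w in product(gens, repeat=n)
    )


def averaging_eigenvalue(m, freq, gens):
    """Eigenvalue of M = (1/|S|) sum_s pi(s) on the character of frequency freq."""
    return sum(cmath.exp(2j * cmath.pi * freq * s / m) for s in gens) / len(gens)


def spectral_radius(m, gens):
    """Largest |eigenvalue| of M over the non-trivial characters of Z/m."""
    return max(abs(averaging_eigenvalue(m, f, gens)) for f in range(1, m))
```

Here the sum over words factors as $(|S|\lambda)^N$, where $\lambda$ is the eigenvalue of $M$ on the character, so its absolute value is at most $|W|\rho^N$ with $\rho$ the spectral radius on non-trivial characters; writing $\rho = |S|^{-\alpha}$ gives a bound of the shape $|W|^{1-\alpha}$.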



Footnote: We will also comment briefly on what happens for $G = SL(2, \mathbf{Z})$.

Footnote: Of course the equidistribution approach may also be used, but it is less efficient and not really quicker or
simpler.
Proof. Let $i \in I$ and let $\pi$ be a representation that factors through $G_i$ and does not contain the
trivial representation. Clearly (9.2) follows from (9.1) since the trace of a matrix is the sum of
the diagonal matrix coefficients in an orthonormal basis.
Let

$$M = \frac{1}{|S|}\sum_{s \in S} \pi(s), \qquad M' = \mathrm{Id} - M,$$

which are both self-adjoint elements of the endomorphism ring $\mathrm{End}(V)$, since $S = S^{-1}$. We
then find by definition

$$\frac{1}{|W|}\sum_{w \in W} \langle \pi(F_w)e, f\rangle = \langle M^N e, f\rangle.$$

Let $\rho \ge 0$ be the spectral radius of $M$, or equivalently the largest of the absolute values of the
eigenvalues of $M$, which are real since $M$ is self-adjoint. Then by Cauchy's inequality we have

$$|\langle M^N e, f\rangle| \le \|e\|\,\|f\|\,\rho^N,$$

so that it only remains to prove that there exists $\delta > 0$, independent of $i$ and $\pi$, such that
$\rho \le 1 - \delta$.

Clearly $\rho = \max(\rho_+, \rho_-)$, where $\rho_+ \in \mathbf{R}$ is the largest eigenvalue and $\rho_-$ is
the opposite of the smallest eigenvalue (if it is negative) and $0$ otherwise. We bound each $\rho_\pm$
separately, proving $\rho_\pm \le 1 - \delta_\pm$ with $\delta_\pm > 0$ independent of $i$ and $\pi$.

For $\rho_+$, it is equivalent (by the variational characterization of the smallest eigenvalue) to
prove that there exists $\delta_+ > 0$, independent of $i$ and $\pi$, such that

$$\frac{\langle M'(v), v\rangle}{\langle v, v\rangle} \ge \delta_+$$

for any non-zero vector $v \in V$. But a simple and familiar computation yields

$$\frac{1}{|S|}\sum_{s \in S} \|\pi(s)v - v\|^2 = 2\langle M'(v), v\rangle$$

and therefore tautologically we have

(9.3) $$\frac{\langle M'(v), v\rangle}{\langle v, v\rangle} \ge \frac{1}{2|S|}\,\inf_{\varpi}\,\inf_{v \ne 0}\,\max_{s \in S} \frac{\|\varpi(s)v - v\|^2}{\|v\|^2},$$

where $\varpi$ ranges over all unitary representations of $\Gamma$ that factor through some $\Gamma/\ker \rho_i$ and do
not contain the trivial representation (and $\|\cdot\|$ on the right-hand side is the unitary norm for
each such representation). But it is precisely the content of Property $(\tau)$ for $\Gamma$ with respect to
$(\ker \rho_i)$ that this triple extremum is $> 0$ (see e.g. [Lu, Def. 4.3.1]).
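For the reader's convenience, the "simple and familiar computation" can be spelled out as follows. This is a short worked derivation added for clarity, using only that each $\pi(s)$ is unitary and that $M = \frac{1}{|S|}\sum_{s \in S}\pi(s)$ is self-adjoint (so that $\langle Mv, v\rangle$ is real).

```latex
\begin{align*}
\frac{1}{|S|}\sum_{s\in S}\|\pi(s)v-v\|^2
 &= \frac{1}{|S|}\sum_{s\in S}\bigl(\|\pi(s)v\|^2+\|v\|^2
      -2\,\mathrm{Re}\,\langle \pi(s)v,v\rangle\bigr)\\
 &= 2\|v\|^2-2\,\mathrm{Re}\,\langle Mv,v\rangle\\
 &= 2\langle (\mathrm{Id}-M)v,v\rangle .
\end{align*}
```

Bounding the average over $s \in S$ from below by $1/|S|$ times the largest term then gives the factor $1/(2|S|)$ in front of the maximum over $s$.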

So we come to $\rho_-$. Here a suitable lower-bound follows from Theorem 6.6 of [SC] (due to
Diaconis, Saloff-Coste, Stroock), using the fact that any eigenvalue of $M$ is also an eigenvalue
of $M_{reg}$, where $M_{reg}$ is the analogue of $M$ for the regular representation of $\Gamma$ on $L^2(\Gamma/\ker \rho_i)$.

For completeness, we prove what is needed here, adapting the arguments to the case of a
general representation. It suffices to prove that there exists $\delta_- > 0$ independent of $i$ and $\pi$ such
that

(9.4) $$\frac{\langle M''(v), v\rangle}{\langle v, v\rangle} \ge \delta_-$$

for all non-zero $v \in V$, where now $M'' = \mathrm{Id} + M$. We have

$$2\langle M''(v), v\rangle = \frac{1}{|S|}\sum_{s \in S} \|\pi(s)v + v\|^2.$$

Now let $r = s_1 \cdots s_c$ be a word of odd length $c$ in the alphabet $S$ such that $r$ is trivial in $\Gamma$;
denote

$$r_k = s_1 \cdots s_k \text{ for } 1 \le k \le c, \qquad r_0 = 1.$$

For $v \in V$, we can write

$$v = \frac{1}{2}\Bigl((v + \pi(s_1)v) - (\pi(s_1)v + \pi(s_1 s_2)v) + \cdots + (\pi(s_1 \cdots s_{c-1})v + \pi(1)v)\Bigr)$$

(the odd length is used here), hence by Cauchy's inequality we get

$$\|v\|^2 \le \frac{c}{4}\sum_{i=0}^{c-1} \|\pi(r_i)v + \pi(r_i s_{i+1})v\|^2 = \frac{c}{4}\sum_{i=0}^{c-1} \|v + \pi(s_{i+1})v\|^2$$

(the representation is unitary). By positivity, since at worst all $s_i$ are equal to the same generator
in $S$, we get

(9.5) $$\|v\|^2 \le \frac{c^2}{4}\sum_{s \in S} \|v + \pi(s)v\|^2 = \frac{c^2|S|}{2}\,\langle M''(v), v\rangle,$$

which implies (9.4) with $\delta_- = \frac{2}{c^2|S|} > 0$. □

Remark 9.3. The odd-looking assumption on the existence of $r$ is indeed necessary for such a
general statement, because of periodicity issues. Namely, if (and in fact only if) all $S$-relations
in $\Gamma$ are of even length, the Cayley graph $C(\Gamma, S)$ of $\Gamma$ with respect to $S$ is bipartite, and so
are its finite quotients $C(\Gamma/\ker \rho_i, S)$. In that case, it is well-known and easy to see that $-1$
is an eigenvalue of $M_{reg}$ (the operator $M$ for the regular representation; take the function such
that $f(x)$ equals $\pm 1$ depending on whether the point is at even or odd distance from the origin)
and the argument above fails. Alternately, this can be seen directly with the exponential sums:
the relations being of even length implies that there is a well-defined surjective homomorphism
$\varepsilon : \Gamma \to \{\pm 1\} \simeq \mathbf{Z}/2\mathbf{Z}$ with $\varepsilon(s) = -1$ for $s \in S$. Viewing $\varepsilon$ as a representation $\Gamma \to \{\pm 1\} \subset \mathbf{C}^\times$, we
have

$$\sum_{w \in W} \varepsilon(F_w) = \begin{cases} |W| & \text{if } N \text{ is even},\\ -|W| & \text{if } N \text{ is odd}. \end{cases}$$

We will describe an example of this for $\Gamma = SL(2, \mathbf{Z})$ below.

The simplest way of ensuring that $r$ exists is to assume that $1 \in S$; geometrically, this
means each vertex of the Cayley graph has a self-loop, and probabilistically, this means that
one considers a "lazy" random walk on the Cayley graph, with probability $1/|S|$ of staying at
the given element.

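In fact, if we consider the effect of replacing $S$ by $S' = S \cup \{1\}$ (in the case where $1 \notin S$), we
have

$$M_{S'} = \Bigl(1 - \frac{1}{|S'|}\Bigr) M_S + \frac{1}{|S'|}\,\mathrm{Id}$$

with obvious notation, and so we obtain

$$\rho_-(S') \le 1 - \frac{2}{|S'|}$$

directly (which is the same lower bound as the one we proved, in the case $c = 1$).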
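Both the parity obstruction and the effect of adding the identity to the generating set can be observed numerically in a toy abelian example. The sketch below assumes the cyclic quotient $\mathbf{Z}/m\mathbf{Z}$ with generators $\pm 1$ (an illustrative choice, with hypothetical function names): for even $m$ the Cayley graph is bipartite and $-1$ is an eigenvalue of $M$, while for the lazy generating set each eigenvalue $\lambda$ becomes $(1 - 1/|S'|)\lambda + 1/|S'|$.

```python
from math import cos, pi


def eigenvalues(m, gens):
    """Eigenvalues of M = (1/|S|) sum_s pi(s) on the characters of Z/m.

    The generating set is assumed symmetric, so the imaginary parts cancel
    and each eigenvalue is the average of cos(2 pi f s / m) over s in S.
    """
    return [sum(cos(2 * pi * f * s / m) for s in gens) / len(gens) for f in range(m)]


def lazy(gens):
    """Add the identity element 0 of Z/m to the generating set."""
    return tuple(gens) + (0,)


if __name__ == "__main__":
    plain = eigenvalues(6, (1, -1))
    with_loop = eigenvalues(6, lazy((1, -1)))
    print(min(plain), min(with_loop))
```

With $m = 6$ the smallest eigenvalue moves from $-1$ to $-1 + 2/3$, in agreement with the bound $1 - 2/|S'|$ for $|S'| = 3$.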

With the estimate of Proposition 9.2, we can perform some sieve.

Theorem 9.4. Let $\mathbf{G} = SL(n)$, $n \ge 3$, or $Sp(2g)$, $g \ge 2$, be as before, $G = \mathbf{G}(\mathbf{Z})$, and let

$$\Psi = (G, \{\text{primes}\}, G \to G_\ell = \mathbf{G}(\mathbf{Z}/\ell\mathbf{Z}))$$

be the group sieve setting. Let $S = S^{-1}$ be a symmetric generating set for $G$, $(W, F)$ the siftable
set of random products of length $N$ of elements of $S$.

(1) For any sieve support $\mathcal{L}$, the large sieve constant for the induced conjugacy sieve satisfies

(9.6) $$\Delta(W, \mathcal{L}) \le |W| + |W|^{1-\alpha} R(\mathcal{L}),$$

where $\alpha > 0$ is a constant depending only on $G$ and $S$ and

$$R(\mathcal{L}) = \max_{m \in \mathcal{L}} A_\infty(G_m) \times \sum_{n \in \mathcal{L}} A_1(G_n).$$

(2) There exists $\eta > 0$ such that

(9.7) $$|\{w \in W \mid \det(F_w - T) \in \mathbf{Z}[T] \text{ is reducible}\}| \ll |W|^{1-\eta}$$

where $\eta$ and the implied constant depend only on $G$ and $S$.

(3) For any sieve support $\mathcal{L}$, the large sieve constant for the group sieve satisfies

(9.8) $$\Delta(W, \mathcal{L}) \le |W| + |W|^{1-\alpha} R(\mathcal{L}),$$

where $\alpha > 0$ is as above and

$$R(\mathcal{L}) = \max_{m \in \mathcal{L}} \sqrt{A_\infty(G_m)} \times \sum_{n \in \mathcal{L}} A_{5/2}(G_n)^{5/2}.$$

(4) There exists $\beta > 0$ such that

(9.9) $$|\{w \in W \mid \text{one entry of } F_w \text{ is a square}\}| \ll |W|^{1-\beta}$$

where $\beta$ and the implied constant depend only on $G$ and $S$.

Footnote: Recall $C(\Gamma, S)$ has vertex set $\Gamma$ and as many edges from $g_1$ to $g_2$ as there are elements $s \in S$ such that
$g_2 = g_1 s$; this allows both loops and multiple edges, and those will occur if $1 \in S$ or, in the Cayley graphs of
quotients of $\Gamma$, if two generators have the same image.

Footnote: I.e., the vertex set $\Gamma$ is partitioned in two pieces $\Gamma_\pm$ and edges always go from one piece to another.

It is clear that the fourth part implies Theorem 1.3.

Lemma 9.5. Let $\mathbf{G}$ be as above.

(1) Property $(\tau)$ holds for the group $G = \mathbf{G}(\mathbf{Z})$ with respect to the family of congruence
subgroups $(\ker(G \to \mathbf{G}(\mathbf{Z}/d\mathbf{Z})))_{d \ge 1}$.

(2) For any symmetric generating set $S = S^{-1}$, there exists an $S$-relation of odd length.

Proof. (1) This is well-known; in fact, the group $G$ is a lattice in a semisimple real Lie group
with $\mathbf{R}$-rank $\ge 2$, and hence it satisfies the stronger Property (T) of Kazhdan, which means
that in (9.3), the infimum may be taken on all unitary representations of $G$ not containing the
trivial representation and remains $> 0$ (see, e.g., [HV, Cor. 3.5], [Lu, Prop. 3.2.3, Ex. 3.2.4,
§4.4]).

(2) If all $S$-relations are of even length, the homomorphism

$$F(S) \to \{\pm 1\}$$

defined by $s \mapsto -1$ induces a non-trivial homomorphism $G \to \{\pm 1\}$. However, there is no
such homomorphism for the groups under consideration (e.g., because its kernel $H$ will be a
finite index normal subgroup, hence by the Congruence Subgroup Property, due to Mennicke
and Bass-Lazard-Serre in this case, see [BMS, p. 64] for references, $H$ will contain a
principal congruence subgroup $\ker(G \to \mathbf{G}(\mathbf{Z}/d\mathbf{Z}))$ for some integer $d \ge 1$, so that the homomorphism
factors through the quotient and defines a non-trivial
homomorphism $\mathbf{G}(\mathbf{Z}/d\mathbf{Z}) \to \{\pm 1\}$, which is impossible since $\mathbf{G}(\mathbf{Z}/d\mathbf{Z})$ is its own commutator
group). □

Proof of Theorem \9.4\ (1) Let m, n S £, vr, r G 11%^, 11* respectively. Since the maps G 
G(Z/fiZ) are onto for all d (e.g., because the family (pd) is linearly disjoint in the sense of Def- 
inition 1231 by Goursat's lemma, as in pi Prop. 5.1]), we have in fact G[m,n] = G(Z/[m, n]Z). 

By Lemma 16.41 the representation [7r,f] of G[^„] defined in (|4.3() contains the identity rep- 
resentation if and only if (m,-7r) = (ra,T), and then contains it with multiplicity one. Let ['7r,f]o 
denote the orthogonal of the trivial component in the second case, and [vr, r]o = [vr, f] otherwise. 



Footnote: With notation as in Section 7.

Footnote: Here we use the fact that $G \to \mathbf{G}(\mathbf{Z}/d\mathbf{Z})$ is surjective, see the first line of the next proof.
We can now appeal to Proposition 9.2 applied to the representation $[\pi, \bar{\tau}]_0 \circ \rho_{[m,n]}$ of $G$, using
the family $(\rho_d : G \to \mathbf{G}(\mathbf{Z}/d\mathbf{Z}))$ of congruence subgroups (since $\ker [\pi, \bar{\tau}]_0 \circ \rho_{[m,n]} \supset \ker(G \to G_{[m,n]})$).
The previous lemma ensures that all required assumptions on $S$ and this family are valid, and
by (9.2), the conclusion is the estimate

$$\bigl|W(\pi, \tau) - \delta(\pi, \tau)|W|\bigr| \le (\dim \pi)(\dim \tau)\,|W|^{1-\alpha}$$

for the exponential sum (4.1), where $\alpha$ depends only on $G$, $S$ and the relevant $(\tau)$ or (T)
constant.

By Proposition I2.1U1 we obtain 



$$\Delta(W, \mathcal{L}) \le |W| + |W|^{1-\alpha} \max_{m \in \mathcal{L}} A_\infty(G_m) \sum_{n \in \mathcal{L}} A_1(G_n)$$



as stated. 

(3) This is exactly similar, except that now we use the basis of matrix coefficients for the 
group sieve setting, and correspondingly we appeal to (IQ.lj) and the fact (see the final paragraphs 
of SectionEJ that the sums W{ipT^^ej, ^T,e'j') are (up to the factor (dim vr) (dim r) ) of the type 
considered in (|9.1j) . 

In the case where [tt,t] contains the trivial representation (i.e., if (m, vr) = (n, r)), we also 
use the fact that, when identified with End(T4-)) the one-dimensional space of invariant vectors 
in TT vf = [vr, tt] is spanned by homotheties and the orthogonal projection of a linear map 
u G End(V^) is multiplication by Tr(u)/Vdim7r (this is a corollary of the orthogonality relations; 
note that ||Id|p = dimvr). This means that for a rank 1 linear map of the form u = e®e' (where 
e is in the space of vr, and e' in that of the contragredient), the projection is the multiplication 
by (e, e')/Vdimvr. Since the vectors are part of an orthonormal basis, we get 

(e,eO(/,f) ^ 

dimvr 
'5((e,/),(e',/')) 



((vr ® vf)(e ® e'), f ® f) = ^"'".^^^'^'^ + ([vr, ^]o(e ® e'), / ® f) 

dimvr 



+ ([vr,vr]o(e®e'),/55/'). 
dimvr 

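Altogether, we obtain

$$\bigl|W(\varphi_{\pi,e,f}, \varphi_{\tau,e',f'}) - \delta((\pi, e, f), (\tau, e', f'))|W|\bigr| \le \sqrt{(\dim \pi)(\dim \tau)}\,|W|^{1-\alpha}$$

and hence

$$\Delta(W, \mathcal{L}) \le |W| + |W|^{1-\alpha} \max_{m,\pi,e,f} \sqrt{\dim \pi} \sum_{n} \sum_{\tau,e',f'} \sqrt{\dim \tau}$$

$$\le |W| + |W|^{1-\alpha} \max_{m} \sqrt{A_\infty(G_m)} \sum_{n} \sum_{\tau} (\dim \tau)^{5/2}$$

$$= |W| + |W|^{1-\alpha} \max_{m} \sqrt{A_\infty(G_m)} \sum_{n} A_{5/2}(G_n)^{5/2}.$$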

(2) To obtain H9.7|) . we apply the large sieve inequality for group sieves of Proposition 14.11 
using (|9.6|1 . This is completely standard; without trying to get the sharpest result (see SectionlTTl 
for more refined arguments in a similar end-game), we select the prime sieve support $\mathcal{L}^* = \{\ell \le L\}$
for some $L \ge 2$, and take $\mathcal{L} = \mathcal{L}^*$ (pedantically, the singletons of elements of $\mathcal{L}^*$...). Letting
$d = n^2 - 1$, $r = n - 1$ (for $\mathbf{G} = SL(n)$) or $d = 2g^2 + g$, $r = g$ (for $\mathbf{G} = Sp(2g)$), we have (using
Lemma IT.fil of Section Ej) 

$$R(\mathcal{L}) \ll L^{d+1},$$

for $L \ge 2$, the implied constant depending only on $\mathbf{G}$.

We take for sieving sets the conjugacy classes in the set $\Omega_\ell \subset \mathbf{G}(\mathbf{F}_\ell)$ of matrices with
irreducible characteristic polynomial in $\mathbf{F}_\ell[T]$. From (1) of Proposition B.1 in Appendix B, we
obtain

(9.10) $$\frac{|\Omega_\ell|}{|G_\ell|} \gg 1$$

for $\ell \ge 3$, where the implied constant depends on $\mathbf{G}$ (compare with [Ch, §3], [Ko1, Lemma 7.2]).

Since those $w \in W$ for which $\det(F_w - T)$ is reducible are contained in the sifted set
$S(W; \Omega, \mathcal{L}^*)$, we have by Proposition 4.1

$$|\{w \in W \mid \det(F_w - T) \in \mathbf{Z}[T] \text{ is reducible}\}| \le \Delta H^{-1} \ll (|W| + |W|^{1-\alpha} L^{d+1}) H^{-1},$$

where $H \gg \pi(L)$ by (9.10). Taking $L = |W|^{\alpha/(d+1)}$, we get the bound stated.

(4) Clearly, it suffices to prove the estimate for the number of $w \in W$ for which the $(i, j)$-th
component of $F_w$ is a square, where $i$ and $j$ are fixed integers from 1 to $n$ or $2g$ in the $SL(n)$
and $Sp(2g)$ cases respectively. The principle is similar, using (3) to estimate the large sieve
constant for the sieve where $\mathcal{L}^* = \{\ell \le L\}$, $\mathcal{L} = \mathcal{L}^*$, with

$$\Omega_\ell = \{g = (g_{\alpha,\beta}) \in \mathbf{G}(\mathbf{F}_\ell) \mid g_{i,j} \text{ is not a square in } \mathbf{F}_\ell\}.$$

We get by Lemma FTBl the bound 

$$R(\mathcal{L}) \ll L^{1+(3d-r)/2}$$

for $L \ge 2$, where the implied constant depends only on $\mathbf{G}$.
Next by (2) of Proposition Ell we have 



$$\frac{|\Omega_\ell|}{|G_\ell|} \gg 1$$

for $\ell \ge 3$ (for $\ell = 2$, the left-hand side may vanish for $SL(2)$), where the implied constant
depends only on $\mathbf{G}$. (The proof in Appendix B uses the Riemann Hypothesis over finite fields;
the reader may find it interesting to see whether a more elementary argument may be found).
Hence the sieve bound is

$$|\{w \in W \mid \text{the } (i, j)\text{-th entry of } F_w \text{ is a square}\}| \ll (|W| + |W|^{1-\alpha} R(\mathcal{L})) H^{-1}$$

with $H \gg \pi(L)$ for $L \ge 3$, the implied constant depending on $\mathbf{G}$. We take $L = |W|^{\alpha/(1+(3d-r)/2)}$
if this is $\ge 3$ and then obtain (9.9). To deal with those $N$ for which this $L$ is $< 3$, we just
enlarge the implied constant in (9.9). □

From part (2) of this theorem, we can easily deduce Theorem 1.2.

Corollary 9.6. Let $\mathbf{G} = SL(n)$, $n \ge 2$, or $Sp(2g)$, $g \ge 1$, let $G = \mathbf{G}(\mathbf{Z})$ and let $S = S^{-1}$ be
a symmetric generating set of $G$. Let $(X_k)$ be the associated simple left-invariant random walk
on $G$. Then almost surely there exist only finitely many $k$ such that $\det(X_k - T)$ is a reducible
polynomial.

Part of the point of this statement is that it requires some quantitative estimate for the
probability that $X_k$ has reducible characteristic polynomial.

Proof. For $n \ge 3$ (resp. $g \ge 2$), it suffices to apply the "easy" Borel-Cantelli lemma, since the
estimate (2) above for $N = k$ is equivalent to

$$P(\det(X_k - T) \text{ is reducible}) \ll \exp(-ak)$$

with $a = \eta \log |S| > 0$, and this shows that the series

$$\sum_{k} P(\det(X_k - T) \text{ is reducible})$$

converges. From the weaker bound (9.11) in Remark 9.10 below, we see that this series remains
convergent for $SL(2, \mathbf{Z}) = Sp(2, \mathbf{Z})$. □
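The convergence used in this proof is just a geometric series. As a short worked computation, writing $C$ for the implied constant in the exponential bound (a symbol introduced here for illustration), one has:

```latex
\begin{align*}
\sum_{k\ge 1} P(\det(X_k-T)\text{ is reducible})
  &\le C\sum_{k\ge 1} e^{-ak} = \frac{C}{e^{a}-1} < +\infty,\\
P(\det(X_k-T)\text{ is reducible for some } k\ge K)
  &\le C\sum_{k\ge K} e^{-ak} = \frac{C\,e^{-aK}}{1-e^{-a}} .
\end{align*}
```

The second line makes the almost sure statement quantitative: the probability of seeing a reducible characteristic polynomial after step $K$ itself decays exponentially in $K$.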



Footnote: If $A_n$ are events in a probability space such that the series $\sum P(A_n)$ converges, then almost surely $\omega$ belongs
to only finitely many $A_n$.
The next corollary is a geometric application which answers a question of Maher [Ma, Question
1.3], and was suggested by Rivin's paper [R]. See [I] for a survey of the mapping class group
of surfaces, [FLP, Exp. 1, 9] for information on pseudo-Anosov diffeomorphisms of surfaces.

Corollary 9.7. Let $G$ be the mapping class group of a closed orientable surface of genus $g \ge 1$,
let $S$ be a finite symmetric generating set of $G$ and let $(X_k)$, $k \ge 1$, be the simple left-invariant
random walk on $G$. Then the set $X \subset G$ of non-pseudo-Anosov elements is transient for this
random walk.

Proof. We follow the arguments of Rivin. First of all, the mapping class group $G$ may be defined
as the group of diffeomorphisms of a fixed compact connected surface $\Sigma_g$ of genus $g$ preserving
the orientation, up to isotopy (i.e., homotopy in the diffeomorphism group). The main point is
that the induced action on the integral homology $H_1(\Sigma_g, \mathbf{Z})$, which preserves the intersection
pairing, yields a surjective map

$$\rho : G \to Sp(2g, \mathbf{Z}).$$

Let $S$ be a generating set as above, and let $S' = \rho(S)$, a finite symmetric generating set for
$Sp(2g, \mathbf{Z})$. The image $Y_k = \rho(X_k)$ of the random walk on $G$ is a random walk on $Sp(2g, \mathbf{Z})$.

Note that the steps are independent and identically distributed, but it is not necessarily true
that each is uniformly distributed on $S'$, which means that we are not exactly in the setting
of Theorem 9.4. However, we can easily prove the analogue of Proposition 9.2 for any random
walk on $Sp(2g, \mathbf{Z})$ defined by identically distributed independent steps $\xi'_k$ with the property
that $P(\xi'_k = s') = p(s') > 0$ for all $s' \in S'$, simply by replacing the self-adjoint operator $M$ with

$$M = \sum_{s' \in S'} p(s')\,\pi(s'),$$

and using the identities

$$\sum_{s' \in S'} p(s')\,\|\pi(s')v \pm v\|^2 = 2\langle (\mathrm{Id} \pm M)v, v\rangle$$

to obtain the bounds

$$\frac{\langle (\mathrm{Id} - M)v, v\rangle}{\langle v, v\rangle} \ge \frac{\min p(s')}{2}\,\inf_{\varpi}\,\inf_{v \ne 0}\,\max_{s \in S'} \frac{\|\varpi(s)v - v\|^2}{\|v\|^2}$$

and

$$\|v\|^2 \le \frac{c^2}{4 \min p(s')} \sum_{s' \in S'} p(s')\,\|v + \pi(s')v\|^2 = \frac{c^2}{2 \min p(s')}\,\langle (\mathrm{Id} + M)v, v\rangle,$$

analogues of (9.3) and (9.5). From this, sieve bounds for the random walk $(Y_k)$ follow,
comparable to those for the simple random walk.

Now, we need only use the fact (the "homological criterion for pseudo-Anosov
diffeomorphism") that it suffices that the following three conditions on the characteristic polynomial
$P = \det(T - \rho(X_k))$ hold for $X_k$ to be pseudo-Anosov:

- $P$ is irreducible;

- there is no root of unity which is a zero of $P$;

- there is no $d \ge 2$ and polynomial $Q$ such that $P(X) = Q(X^d)$.

Accordingly we have

$$P(X_k \text{ is not pseudo-Anosov}) \le p_1 + p_2 + p_3$$

where $p_1$, $p_2$, $p_3$ are the probabilities that $\det(T - \rho(X_k))$ fails the first, second and third of
those conditions, respectively. Assume first $g \ge 2$. Then, according to (2) of Theorem 9.4 (adapted
to a non-simple random walk), there exists $a > 0$ such that

$$p_1 \ll \exp(-ak)$$

for $k \ge 1$. To estimate $p_2$ and $p_3$ we can use simpler sieves to obtain comparable bounds. For
$p_2$, since $P$ is an integral polynomial of degree $2g$ and hence may have only finitely many roots
of unity as zeros, we need only estimate the sifted sets for the sifting sets

$$\Omega_\ell = \{g \in Sp(2g, \mathbf{F}_\ell) \mid (\Phi_d \bmod \ell) \nmid \det(T - g)\}$$

where $\Phi_d \in \mathbf{Z}[X]$ is the $d$-th cyclotomic polynomial for some fixed $d \ge 1$. It is clear (by the
same local density arguments of Appendix B that were already used) that $|\Omega_\ell| \gg |Sp(2g, \mathbf{F}_\ell)|$,
and hence the sieve again yields $p_2 \ll \exp(-ak)$ for $k \ge 1$.
For $p_3$, we consider similarly

$$\Omega'_\ell = \{g \in Sp(2g, \mathbf{F}_\ell) \mid \det(T - g) \text{ is not of the form } Q(X^d)\}$$

for some fixed $d \ge 2$. We also trivially have $|\Omega'_\ell| \gg |Sp(2g, \mathbf{F}_\ell)|$, and $p_3 \ll \exp(-ak)$.

Now we conclude that $P(X_k \text{ is not pseudo-Anosov}) \ll \exp(-ak)$, and we can again apply
the Borel-Cantelli lemma. □

Footnote: The existence of such a finite generating set is not obvious, of course, and is known as the Dehn-Lickorish
Theorem; see e.g. [I, Th. 4.2.D].

Remark 9.8. Maher [Ma] proved that the probability that $X_k$ is pseudo-Anosov tends to 1 as
$k \to +\infty$ using rather more of the geometry and structure of the mapping class group, and
asked about the possible transience of the set of non-pseudo-Anosov elements. However, his
methods are also more general, and work for random walks on any subgroup of $G$ that is not
"too small" in some sense. It should be emphasized that his condition encompasses groups
which seem utterly inapproachable by sieve as above, in particular the so-called Torelli group
which is the kernel of the homology action $\rho$ above. It may seem surprising that pseudo-Anosov
elements should exist in this subgroup, but Maher's result shows that they remain "generic" (see [FLP,
p. 250] for a construction which gives some examples, and the observation that Nielsen had
conjectured they did not exist). It would be interesting to know whether the set of
non-pseudo-Anosov elements is still transient for the random walk on the Torelli group.

Remark 9.9. In the most classical sieves, estimating either the analogue of $R(\mathcal{L})$ or $H$ is not a
significant part of the work, the latter because once $\Omega_\ell$ is known, which is usually not a problem
there, it boils down to estimates for sums of multiplicative functions, which are well understood.

The results we have proved, and an examination of Appendix B, show that when performing
a sieve in some group setting, sharp estimates for $R(\mathcal{L})$ or for $H$ involve deeper tools. For the
large sieve constant, this involves the representation theory of the group in non-trivial ways.
For $H$, the issue of estimating $|\Omega_\ell|$ may quickly become a difficult counting problem over finite
fields. It is not hard to envision situations where the full force of Deligne's work on exponential
sums over finite fields becomes really crucial, and not merely a convenience.

Note that the use of the sharp upper bounds of Proposition 17. in the proof of Theorem 19.41 
is not necessary if one wishes merely to find a bound for the large sieve constant of the type 
$|W| + |W|^{1-\alpha} L^A$ for some $\alpha > 0$ and $A \ge 0$: trivial bounds for $A_p(G)$ are sufficient.

If no exact value of the (T)-constant for $G = \mathbf{G}(\mathbf{Z})$ and $S$ is known, the value of $\alpha$ coming from
Proposition 9.2 is not explicit, so knowing a specific value of $A$ is not particularly rewarding.
However, in some cases explicit Kazhdan, hence $(\tau)$, constants are known for the groups we
are considering. The question of such explicit bounds was first raised by Serre, de la Harpe
and Valette; the arguments above show it is indeed a very natural question with concrete
applications, such as explicit sieve bounds. The first results for arithmetic groups are due to
Burger for $SL(3, \mathbf{Z})$ (see [HV, Appendice]).

To give an idea, we quote a result of Kassabov [Ka], improving an earlier one of Shalom [Sha,
Cor. 1]: let $G = SL(n, \mathbf{Z})$ with $n \ge 3$, and let $S$ be the symmetric generating set (of $2(n^2 - n)$
elements) of elementary matrices $E_{i,j}(\pm 1)$ with $\pm 1$ in the $(i, j)$-th entry. Then, for any unitary
representation $\pi$ of $G$ not containing the trivial representation, and any non-zero vector $v$ in
the space of $\pi$, there exists $s \in S$ such that $\|\pi(s)v - v\| \ge \varepsilon_n \|v\|$, with $\varepsilon_n = (42\sqrt{n} + 920)^{-1}$.

The standard commutator relation 



$$E_{1,2}(1)\,[E_{1,3}(1), E_{3,2}(1)]^{-1} = 1$$
(which uses that $n \ge 3$...) shows that there are relations of odd length $\le 5$ in terms of $S$. Looking
at the proof of Proposition 19.21 we see that we can take 5+ = ^ef^\S\~^ and 6- = 25(W^-n) ^ 
This means that for this particular generating set, $\alpha$ in Theorem 9.4 can be taken to satisfy

^ _ log(l-4|5|-i) ^ 1 



log |5| 8(n2 - n)(2lVH + 460)2 log(2(n2 - n)) 

for $n \ge 3$. So we have

$$|\{w \in W \mid \det(F_w - T) \text{ is reducible}\}| \ll |W|^{1-\eta}$$
for $N \ge 1$, the implied constant depending on $n$, with $\eta$ given by

a 1 1 

n2 8n^(n — l)(21-^/n + 460)2 log(2(n2 — n)) n^logn 

Coming back to the probabilistic interpretation (which is more suited to what follows), this
means in particular that if $k$ is of order of magnitude larger than $\log n$, the probability that
$\det(X_k - T)$ is irreducible becomes close to 1. It would be interesting to have a more precise
knowledge of this "transition time"

$$\tau_n = \min\{k \ge 1 \mid \det(X_k - T) \text{ is irreducible}\},$$

(which, of course, depends also on $S$).

Note that, at the very least, with this particular generating set, $\det(X_k - T)$ is reducible for
$k < t_n$ where $t_n$ is the first time when all basis vectors have been moved at least once. Since
multiplying by $\xi_k$ means moving one of the $n$ basis vectors chosen uniformly, $t_n$ is the stopping
time for the "coupon collector problem". Besides the obvious bound $t_n \ge n$, it is well-known
(see, e.g., IX.3.d]) that 

$$E(t_n) = n(\log n + \gamma) + O(1) \text{ for } n \ge 1, \qquad V(t_n) \sim \zeta(2)\,n^2 \text{ as } n \to +\infty.$$
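These classical coupon collector formulas are easy to check exactly. The sketch below assumes the standard model of $n$ equally likely coupons drawn independently (the function names are hypothetical); it computes $E(t_n) = n H_n$ and the variance as a sum of variances of independent geometric waiting times, using exact rational arithmetic.

```python
from fractions import Fraction
from math import log

EULER_GAMMA = 0.5772156649015329  # Euler's constant, to double precision


def expected_collection_time(n):
    """E(t_n) = n * H_n for n equally likely coupons."""
    return n * sum(Fraction(1, k) for k in range(1, n + 1))


def variance_collection_time(n):
    """V(t_n): sum of the variances (1 - p) / p^2 of the geometric waiting times,
    where p = (n - j) / n is the chance of a new coupon once j have been seen."""
    return sum(
        (1 - Fraction(n - j, n)) / Fraction(n - j, n) ** 2 for j in range(n)
    )


def asymptotic_expectation(n):
    """The main term n (log n + gamma) from the formula above."""
    return n * (log(n) + EULER_GAMMA)
```

Comparing `float(expected_collection_time(n))` with `asymptotic_expectation(n)` for moderately large `n` shows a difference that stays bounded, consistent with the $O(1)$ term.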

The gap between upper and lower bounds for $\tau_n$ is quite large, and numerical experiments
strongly suggest that the lower bound is closer to the truth (in fact, it suggests that $E(\tau_n)$ might
be $\sim c\,E(t_n)$ for some constant $c > 1$ as $n \to +\infty$). In terms of possible improvements, it is
interesting to note that the order of magnitude of Kassabov's estimate of the Kazhdan constant
En for this generating set S is optimal, since Zuk has pointed out that it must be ^ at 
least (see |KIial p. 149]). 

Remark 9.10. If $G = SL(2, \mathbf{Z})$, although $G$ does not have Property (T), it is still true that
Property $(\tau)$ holds for the congruence subgroups $\ker(SL(2, \mathbf{Z}) \to SL(2, \mathbf{Z}/d\mathbf{Z}))$, by Selberg's
$\lambda_1 \ge 3/16$ theorem on the smallest eigenvalue of the hyperbolic laplacian acting on congruence
subgroups of $SL(2, \mathbf{Z})$. However, the second condition of Lemma 9.5 is not true. Indeed, there
is a well-known homomorphism

$$SL(2, \mathbf{Z}) \to SL(2, \mathbf{F}_2) \simeq \mathfrak{S}_3 \xrightarrow{\ \varepsilon\ } \{\pm 1\}$$

(where the isomorphism in the middle is obtained by looking at the action on the three lines in
$\mathbf{F}_2^2$, and $\varepsilon$ is the signature), and (for instance) the generators

$$S = \left\{ \begin{pmatrix} 1 & \pm 1 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ \pm 1 & 1 \end{pmatrix} \right\}$$

(which are the analogues of the generating set for $SL(n, \mathbf{Z})$ considered in the previous remark)
all map to transpositions in $\mathfrak{S}_3$. So $\varepsilon(r) = -1$ for any word of odd length in the alphabet $S$.

Still, while this shows that Proposition 9.2 can not be applied, it remains true that for an
arbitrary symmetric set of generators $S$ of $SL(2, \mathbf{Z})$ and for odd primes $\ell$, the Cayley graph of
$SL(2, \mathbf{F}_\ell)$ with respect to $S$ is not bipartite (because any homomorphism $SL(2, \mathbf{F}_\ell) \to \{\pm 1\}$
is still trivial for $\ell \ge 3$). Hence this Cayley graph contains some cycle of odd length, which is
easily checked to be $\le 2d(\ell) + 1$, where $d(\ell)$ is the diameter of the Cayley graph. Since we
have an expander family (by Property $(\tau)$), there is a bound

$$d(\ell) \ll \log \ell$$

for $\ell \ge 3$, where the implied constant depends only on $S$ (since the $(\tau)$-constant, hence the
expanding ratio, is fixed); see, e.g., [SC, §6.4]. After a look at the character table of $SL(2, \mathbf{F}_\ell)$,
it is not difficult to check that this leads to sieve constants such that

A ^ + \W\ exp(^-j^^^|max'i/'(n) ^ V("^)^| 

where $L = \max \mathcal{L}$ and $c > 0$ depends only on $S$ (see (11.7) for the definition of $\psi(m)$).

For the problem of irreducibility (which is not the most interesting question about quadratic
polynomials, perhaps...), taking $\mathcal{L}$ to be the odd primes $\leq L$, this leads to

\[ |\{w \in W \mid \det(F_w - T) \in \mathbf{Z}[T] \text{ is reducible}\}| \ll |W| \exp(-c'\sqrt{N}) \tag{9.11} \]

where $c' > 0$ depends only on $S$, and as observed already, this proves Corollary 9.6 and Theorem O
for $SL(2,\mathbf{Z})$.

Remark 9.11. The work of Bourgain, Gamburd and Sarnak (see [BGS] and Sarnak's slides [Sa3])
is based on another type of sieve setting, which amounts to the following. First, we have a
finitely generated group $\Gamma$ which is a discrete subgroup of a matrix group over $\mathbf{Z}$, acting on an
affine algebraic variety $V/\mathbf{Z}$. Then the sieve setting is $(\Gamma \cdot v, \{\text{primes}\}, \rho_\ell)$ where $\Gamma \cdot v$ is the orbit
of a fixed element $v \in V(\mathbf{Z})$, and $\rho_\ell$ is the reduction map to the finite orbit of the reduction in
$V(\mathbf{F}_\ell)$ (with uniform density). The siftable set is a subset $Y$ of the orbit defined by the images
of elements of $\Gamma$ of bounded word-length or bounded norm, with counting measure and identity
map.

Remark 9.12. If we consider an abstract finitely generated group F, and wish to investigate by 
sieve methods some of its properties, the family of reduction maps modulo primes makes no 
sense. We want to point out a family (pi) that may be of use, inasmuch as it satisfies the linear 
disjointness condition of Definition 12.

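Let $\tilde{\Lambda}$ be the set of surjective homomorphisms

\[ \rho : G \to \rho(G) = H \]

where $H$ is a non-abelian finite simple group, and let $\Lambda \subset \tilde{\Lambda}$ be a set of representatives for the
equivalence relation $\rho_1 \sim \rho_2$ if and only if there exists an isomorphism $\rho_1(G) \to \rho_2(G)$ such that
the triangle

\[ \begin{array}{ccc} & G & \\ {\scriptstyle \rho_1}\swarrow & & \searrow{\scriptstyle \rho_2} \\ \rho_1(G) & \longrightarrow & \rho_2(G) \end{array} \tag{9.12} \]

commutes.

Lemma 9.13. The system $(\rho)_{\rho \in \Lambda}$ constructed in this manner is linearly disjoint.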

This is an easy adaptation of classical variants of the Goursat-Ribet lemmas, and is left as 
an exercise (see e.g. jEl Lemma 3.3]). 

Fix some vertex $x$ and find two vertices $y$ and $z$ which are neighbors but satisfy $d(x,y) \equiv d(x,z) \pmod 2$
(those exist, because otherwise the graph would be bipartite; note that then $d(x,y) = d(x,z)$); then follow a path $\gamma_1$
of length $d(x,y) \leq d(\ell)$ from $x$ to $y$, take the edge from $y$ to $z$, then follow a path of length $d(z,x) = d(x,y)$ from
$z$ to $x$ to obtain a loop of length $2d(x,y) + 1 \leq 2d(\ell) + 1$. The example of a cycle of odd length, i.e., of the Cayley
graph of $\mathbf{Z}/m\mathbf{Z}$ with $m$ odd with respect to $S = \{\pm 1\}$, shows that this is best possible for arbitrary graphs,
and the Ramanujan graphs of Lubotzky-Phillips-Sarnak give examples of expanding families where diameter and
length of shortest loop (not necessarily of odd length) are of the same order of magnitude, see [Sa2, Th. 3.3.1].

To make an efficient sieve, it would be necessary, in practice, to have some knowledge of $\Lambda$,
such as the distribution of the orders of the finite simple quotient groups of $G$. This is of course
in itself an interesting problem (see, e.g., the book [LS]).
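The odd-cycle argument used earlier in this section (an edge joining two vertices at the same distance from a base point closes up a loop of odd length at most twice the diameter plus one) is easy to test on small graphs. The following Python sketch is only an illustration under simple assumptions: graphs are given as adjacency dictionaries, and the helper names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Graph distances from source, by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def odd_closed_walk_length(adj, x):
    """Shortest odd loop through x of the form path + edge + path.

    Looks for an edge (y, z) with d(x, y) == d(x, z) and returns the
    smallest value of 2 * d(x, y) + 1, or None if no such edge exists
    (which happens exactly when the component of x is bipartite).
    """
    dist = bfs_distances(adj, x)
    best = None
    for y in dist:
        for z in adj[y]:
            if dist[y] == dist[z]:
                length = 2 * dist[y] + 1
                if best is None or length < best:
                    best = length
    return best


def cycle_graph(m):
    """Cayley graph of Z/mZ with respect to S = {+1, -1}."""
    return {v: [(v + 1) % m, (v - 1) % m] for v in range(m)}


def eccentricity(adj, x):
    """Largest distance from x (equal to the diameter for Cayley graphs)."""
    return max(bfs_distances(adj, x).values())
```

For the cycle of odd length $m$, the value found is $m = 2d + 1$ with $d = (m-1)/2$ the diameter, which is the extremal case mentioned in the text; for even $m$ the graph is bipartite and no odd loop exists.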

10. The elliptic sieve 

The next application is also apparently new, although it concerns a sieve which is a sort of
"twisted" version of the classical large sieve. Let $E/\mathbf{Q}$ be an elliptic curve given by a Weierstrass
equation

\[ y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6, \quad \text{where } a_i \in \mathbf{Z}. \]

Assuming the rank of $E$ is positive, let $\Lambda_E$ be the set of primes $\ell$ of good reduction, and for
$\ell \in \Lambda_E$, let $\rho_\ell : E(\mathbf{Q}) \to E(\mathbf{F}_\ell)$ be the reduction map.

The natural sets $X \subset E(\mathbf{Q})$ for sieving are the finite sets of rational points $x \in E(\mathbf{Q})$ with
(canonical or naive) height $h(x) \leq T$ for some $T \geq 0$ (with again counting measure and $F_x = x$
for $x \in X$). There are interesting potential applications of such sieves, because of the following
interpretation: a rational point $x = (r, s) \in E(\mathbf{Q})$ (in affine coordinates, so $x$ is non-zero in
$E(\mathbf{Q})$) maps to a non-zero point in $E(\mathbf{F}_\ell)$ if and only if $\ell$ does not divide the denominator of the
affine coordinates $r$ and $s$ of the point. This shows that integral or $S$-integral points (in the
affine model above) appear naturally as (subsets of) sifted sets.

We use such ideas to prove Theorem 1.4, showing that most rational points have denominators
divisible by many (small) primes. Recall that $\omega_E(x)$ is the number of primes, without
multiplicity, dividing the denominator of the coordinates of $x$, with $\omega_E(0) = +\infty$. We also
recall the statement:

Theorem 10.1. Let $E/\mathbf{Q}$ be an elliptic curve with rank $r \geq 1$. Then we have

\[ |\{x \in E(\mathbf{Q}) \mid h(x) \leq T\}| \sim c_E T^{r/2} \tag{10.1} \]

as $T \to +\infty$, for some constant $c_E > 0$, and moreover for any fixed real number $\kappa$ with $0 < \kappa < 1$,
we have

\[ |\{x \in E(\mathbf{Q}) \mid h(x) \leq T \text{ and } \omega_E(x) < \kappa \log\log T\}| \ll T^{r/2}(\log\log T)^{-1} \]

for $T \geq 3$, where the implied constant depends only on $E$ and $\kappa$.
Proof. Let $M \simeq \mathbf{Z}^r$ be a subgroup of $E(\mathbf{Q})$ such that

\[ E(\mathbf{Q}) = M \oplus E(\mathbf{Q})_{\mathrm{tors}}, \]

and let $(x_1, \ldots, x_r)$ be a fixed $\mathbf{Z}$-basis of $M$. Moreover, let $M'$ be the group generated by
$(x_2, \ldots, x_r)$. We will in fact perform sieving only on "lines" directed by $x_1$.

But first of all, since the canonical height is a positive definite quadratic form on $E(\mathbf{Q}) \otimes \mathbf{R} =
M \otimes \mathbf{R}$, the asymptotic formula (10.1) is clear: it amounts to nothing else but counting integral
points in $M \otimes \mathbf{R} \simeq \mathbf{R}^r$ with norm $\sqrt{h(x)} \leq \sqrt{T}$ (this being repeated as many times as there
are torsion cosets).

Moreover, we may (for convenience) measure the size of elements in $E(\mathbf{Q})$ using the squared
$L^\infty$-norm

\[ \|x\|_\infty^2 = \max_i |a_i|^2, \quad \text{for } x = \sum_i a_i x_i + t \text{ with } t \in E(\mathbf{Q})_{\mathrm{tors}}, \]

i.e., we have $h(x) \asymp \|x\|_\infty^2$ for all $x \in M$, the implied constants depending only on $E$.

Now we claim the following: 

Lemma 10.2. For any fixed $\kappa \in ]0,1[$, any fixed $x' \in M'$, any fixed torsion point $t \in E(\mathbf{Q})_{\mathrm{tors}}$,
we have

\[ |\{x \in (t + x') \oplus \mathbf{Z}x_1 \mid \|x\|_\infty^2 \leq T \text{ and } \omega_E(x) < \kappa \log\log T\}| \ll \sqrt{T}(\log\log T)^{-1} \]

for $T \geq 3$, the implied constant depending only on $E$, $\kappa$ and $x_1$, but not on $x'$ or $t$.

Taking this for granted, we conclude immediately that

\[ |\{x \in E(\mathbf{Q}) \mid h(x) \leq T \text{ and } \omega_E(x) < \kappa \log\log T\}| \ll T^{r/2}(\log\log T)^{-1} \]

by summing the inequality of the lemma over all $x' \in M'$ with $\|x'\|_\infty^2 \leq T$ and over all
$t \in E(\mathbf{Q})_{\mathrm{tors}}$ (the number of pairs $(t, x')$ is $\ll T^{(r-1)/2}$), the implied constant depending only on $E$
and the choice of basis of $M$.

Next we come to the proof of this lemma. Fix $x' \in M'$, $t \in E(\mathbf{Q})_{\mathrm{tors}}$. The left-hand side of
the lemma being zero unless $\|t + x'\|_\infty^2 \leq T$, we assume that this is the case. We will use the
following group sieve setting:

\[ \Psi = (\mathbf{Z}x_1, \Lambda_E, \mathbf{Z}x_1 \to \rho_\ell(\mathbf{Z}x_1) \subset \rho_\ell(E(\mathbf{Q}))), \]

\[ X = \{m x_1 \in G \mid \|t + x' + m x_1\|_\infty^2 \leq T\}, \qquad F_x = x. \]

For any prime $\ell \in \Lambda_E$, the finite group $G_\ell$ is a quotient of $\mathbf{Z}x_1$ and is isomorphic to $\mathbf{Z}/\nu(\ell)\mathbf{Z}$
where $\nu(\ell)$ is the order of the reduction of $x_1$ modulo $\ell$. (So this sieve is really an ordinary-looking
one for integers, except for the use of reductions modulo $\nu(\ell)$ instead of reductions
modulo primes).

Lemma 10.3. Let $x_1$ be an infinite order point on $E(\mathbf{Q})$, and $\nu(\ell)$ the order of $x_1$ modulo $\ell$.
Then all but finitely many primes $p$ occur as the value of $\nu(\ell)$ for some $\ell$ of good reduction.

Proof. For a prime $p$, consider $p x_1 \in E(\mathbf{Q})$. A prime $\ell$ of good reduction divides the denominator
of the coordinates of $p x_1$ if and only if $p \equiv 0 \pmod{\nu(\ell)}$, which means that $\nu(\ell)$ is either 1 or $p$.

So if $p$ is not of the form $\nu(\ell)$, it follows that $p x_1$ is an $S$-integral point, where $S$ is the union of
the set of primes of bad reduction and the finite set of primes $\ell$ where $x_1 \equiv 0 \pmod{\ell}$. By Siegel's
finiteness theorem (see, e.g., [SiS, Th. IX.4.3]), there are only finitely many $S$-integral points
in $E(\mathbf{Q})$, and therefore only finitely many $p$ for which $p$ is not of the form $\nu(\ell)$. □

(Note that this lemma is also a trivial consequence of a result of Silverman [Si1, Prop.
10] according to which all but finitely many integers are of the form $\nu(\ell)$ for some $\ell$; in fact
Silverman's result depends on a stronger form of Siegel's theorem).

The lemma allows us to sieve $X$ using as prime sieve support $\mathcal{L}^*$ the set of $\ell \in \Lambda_E$ such
that $\nu(\ell)$ is a prime number $p \leq L$ (where, in case the same prime $p$ occurs as the value of $\nu(\ell)$
for two or more primes, we keep only one), and with $\mathcal{L} = \mathcal{L}^*$ (with the usual identification of
elements with singletons in $S(\Lambda)$).

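From the lemma, it follows that the inequality defining the large sieve constant here, namely

\[ \sum_{\ell \in \mathcal{L}}\ \sum_{a \ (\mathrm{mod}\ \nu(\ell))}^{*} \Bigl| \sum_{|m| \leq \sqrt{T}} \alpha(m) e\Bigl(\frac{am}{\nu(\ell)}\Bigr) \Bigr|^2 \leq \Delta \sum_{|m| \leq \sqrt{T}} |\alpha(m)|^2 \tag{10.2} \]

for all $(\alpha(m))$, can be reformulated as

\[ \sum_{p \leq L}^{\flat}\ \sum_{a \ (\mathrm{mod}\ p)}^{*} \Bigl| \sum_{|m| \leq \sqrt{T}} \alpha(m) e\Bigl(\frac{am}{p}\Bigr) \Bigr|^2 \leq \Delta \sum_{|m| \leq \sqrt{T}} |\alpha(m)|^2 \]

where $\flat$ in the sum over $p$ indicates that only those $p$ which occur as $\nu(\ell)$ for some $\ell$ are
taken into account. We recognize the most standard large sieve inequality, and by positivity, it
follows that

\[ \Delta \leq 2\sqrt{T} + L^2 \]

for $L \geq 2$. We now apply Proposition 3.1: we have

\[ \sum_{x \in X} (P(x,\mathcal{L}) - P(\mathcal{L}))^2 \leq \Delta Q(\mathcal{L}) \tag{10.3} \]

where $P(x,\mathcal{L})$, $P(\mathcal{L})$ and $Q(\mathcal{L})$ are defined in (3.2), for any given choice of sets $\Omega_\ell \subset G_\ell$ for
$\ell \in \Lambda_E$.

We let $\Omega_\ell = \{-\rho_\ell(t + x')\}$. By the remark before the statement of Theorem 10.1 we have
$\rho_\ell(m x_1) \in \Omega_\ell$ if and only if $\ell$ divides the denominator of the coordinates of $t + x' + m x_1$, and
therefore for $x = m x_1 \in X$, we have

\[ P(m x_1, \mathcal{L}) \leq \omega_E(t + x' + m x_1). \]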

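On the other hand, we have

\[ P(\mathcal{L}) = \sum_{\ell \in \mathcal{L}} \frac{1}{\nu(\ell)} = \sum_{p \leq L}^{\flat} \frac{1}{p} = \sum_{p \leq L} \frac{1}{p} + O(1) = \log\log L + O(1) \]

for any $L \geq 3$, because, by Lemma 10.3, the values $\nu(\ell) \leq L$ range over all primes $\leq L$, with
only finitely many exceptions (independently of $L$).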
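The last two equalities rest on Mertens' estimate $\sum_{p \leq L} 1/p = \log\log L + M + O(1/\log L)$, and on the fact that removing a fixed finite set of primes changes the sum by a constant. The Python sketch below illustrates both points numerically; the numerical value of the Mertens constant $M$ and the choice of excluded primes are assumptions made for the illustration, not data from the paper.

```python
import math

# Approximate value of the Mertens constant M (assumed for illustration).
MERTENS_CONSTANT = 0.2614972128


def primes_up_to(limit):
    """Primes <= limit, by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for n in range(2, math.isqrt(limit) + 1):
        if sieve[n]:
            sieve[n * n :: n] = bytearray(len(range(n * n, limit + 1, n)))
    return [n for n in range(2, limit + 1) if sieve[n]]


def reciprocal_prime_sum(limit, exceptions=()):
    """Sum of 1/p over primes p <= limit, skipping a finite exceptional set.

    The exceptional set plays the role of the finitely many primes that are
    not of the form nu(l) in Lemma 10.3.
    """
    excluded = set(exceptions)
    return sum(1.0 / p for p in primes_up_to(limit) if p not in excluded)


def mertens_main_term(limit):
    """The approximation log log(limit) + M."""
    return math.log(math.log(limit)) + MERTENS_CONSTANT
```

For instance, comparing `reciprocal_prime_sum(10**5)` with `mertens_main_term(10**5)` shows agreement to within a small error, and passing `exceptions={2, 3}` shifts the sum by exactly $1/2 + 1/3$, independently of the limit.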

Hence there exists $L_0$ depending on $E$, $x_1$ and $\kappa$ only, such that if $L \geq L_0$, we have

\[ P(\mathcal{L}) \geq \frac{1+\kappa}{2} \log\log L. \]

Putting together these two inequalities, we see that if we assume $T \leq L^4$, say, and $L \geq L_0'$
for some other constant $L_0'$ (depending on $E$, $x_1$ and $\kappa$), then for any $m x_1 \in X$ such that
$t + x' + m x_1$ satisfies $\omega_E(t + x' + m x_1) < \kappa \log\log T$, we have

\[ (P(x,\mathcal{L}) - P(\mathcal{L}))^2 \gg (\log\log T)^2, \]

the implied constant depending only on $E$, $x_1$ and $\kappa$. So it follows by positivity from (10.3) and
the inequality $Q(\mathcal{L}) \leq P(\mathcal{L}) \ll \log\log T$ that

\[ |\{x \in (t + x') \oplus \mathbf{Z}x_1 \mid \|x\|_\infty^2 \leq T \text{ and } \omega_E(x) < \kappa \log\log T\}| \ll \Delta (\log\log T)^{-1} \ll (\sqrt{T} + L^2)(\log\log T)^{-1} \]

for any $L \geq L_0'$. If $T^{1/4} \geq L_0'$, we take $L = T^{1/4}$ and prove the inequality of the lemma directly,
and otherwise we need only increase the resulting implied constant to make it valid for all $T \geq 3$,
since $L_0'$ depends only on $E$, $x_1$ and $\kappa$. □

It would be interesting to know whether there is some "regular" distribution for the function
$\omega_E(x)$. Notice the similarity between the above discussion and the Hardy-Ramanujan results
concerning the normal order of the number of prime divisors of an integer (see e.g. [HW, 22.11]),
but note that since the denominators of rational points $x$ are typically of size $\exp h(x)$, they
should have around

\[ \log\log\exp(h(x)) = \log(h(x)) \]

prime divisors in order to be "typical" integers. However, note also that the prime divisors 
accounted for in the proof above are all ^ T^/'^ ~ y^h{x) ~ ylogn; it is typical behavior for 
an integer n ^ T to have roughly log log log T prime divisors of this size (much more precise 
results along those lines are known, due in particular to Turan, Erdos and Kac). 

Note also that, as mentioned during the discussion of Proposition 3.1, applying the (apparently
stronger) form of the large sieve involving squarefree numbers would only give a bound for
the number of points which are $\mathcal{L}$-integral. Since (for any finite set $S$) there are only finitely
many $S$-integral points, and moreover this is used in the proof of Lemma 10.3, this would not
be a very interesting conclusion.

We can relate this sieve, more precisely Lemma 10.2, to so-called elliptic divisibility sequences,
a notion introduced by M. Ward and currently the subject of a number of investigations by
Silverman, T. Ward, Everest, and others (see e.g. [Si2], [W], [EEW]). This shows that the
proposition above has very concrete interpretations.

Proposition 10.4. Let $(W_n)_{n \geq 0}$ be an unbounded sequence of integers such that

\[ W_0 = 0, \quad W_1 = 1, \quad W_2 W_3 \neq 0, \quad W_2 \mid W_4, \]
\[ W_{m+n} W_{m-n} = W_{m+1} W_{m-1} W_n^2 - W_{n+1} W_{n-1} W_m^2, \quad \text{for } m \geq n \geq 1, \]
\[ \Delta = W_4 W_2^{15} - W_3^3 W_2^{12} + 3 W_4^2 W_2^{10} - 20 W_4 W_3^3 W_2^7 + 3 W_4^3 W_2^5 + 16 W_3^6 W_2^4 + 8 W_4^2 W_3^3 W_2^2 + W_4^4 \neq 0. \]

Then for any $\kappa$ such that $0 < \kappa < 1$, we have

\[ |\{n \leq N \mid \omega(W_n) < \kappa \log\log N\}| \ll \frac{N}{\log\log N} \]

for $N \geq 3$, where the implied constant depends only on $\kappa$ and $(W_n)$.

Proof. This depends on the relation between elliptic divisibility sequences and pairs $(E, x_1)$ of
an elliptic curve $E/\mathbf{Q}$ and a point $x_1 \in E(\mathbf{Q})$. Precisely (see e.g. [EEW, §2]) there exists such
a pair $(E, x_1)$ with $x_1$ of infinite order such that if $(a_n)$, $(b_n)$, $(d_n)$ are the (unique) sequences
of integers with $d_n \geq 1$, $(a_n, d_n) = (b_n, d_n) = 1$ and

\[ n x_1 = \Bigl(\frac{a_n}{d_n^2}, \frac{b_n}{d_n^3}\Bigr), \]

then we have

\[ d_n \mid W_n \quad \text{for } n \geq 1 \]

(without the condition $\Delta \neq 0$, this is still true provided singular elliptic curves are permitted;
the condition that $(W_n)$ be unbounded implies that $x_1$ is of infinite order).

Now the $d_n$ are precisely the denominators of the coordinates of the points in $\mathbf{Z}x_1$, and we
have therefore

\[ \omega(W_n) \geq \omega(d_n) = \omega_E(n x_1). \]

Hence Lemma 10.2 gives the desired result. □

The "simplest" example is the sequence $(W_n)$ given by

\[ W_0 = 0, \quad W_1 = 1, \quad W_2 = 1, \quad W_3 = -1, \quad W_4 = 1, \]

\[ W_n = \frac{W_{n-1} W_{n-3} + W_{n-2}^2}{W_{n-4}} \quad \text{for } n \geq 5 \]

(sequence A006769 in the Online Encyclopedia of Integer Sequences), which corresponds to the case
of $E : y^2 - y = x^3 - x$ and $x_1 = (0,0)$.
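The recurrence above is straightforward to run, and doing so makes the divisibility property and the quantity $\omega(W_n)$ of Proposition 10.4 concrete. The following Python sketch is only an illustration: the function names are hypothetical, the recurrence is started at $n = 5$ (since $W_0 = 0$ cannot be used as a divisor), and $\omega$ is computed by naive trial division, which is adequate only for small indices.

```python
def elliptic_divisibility_sequence(count):
    """First `count` terms W_0, ..., W_{count-1} of the sequence above."""
    w = [0, 1, 1, -1, 1]
    for n in range(5, count):
        numerator = w[n - 1] * w[n - 3] + w[n - 2] ** 2
        quotient, remainder = divmod(numerator, w[n - 4])
        # The division is exact for an elliptic divisibility sequence.
        assert remainder == 0
        w.append(quotient)
    return w[:count]


def omega(n):
    """Number of distinct prime divisors of a non-zero integer n."""
    n = abs(n)
    if n == 0:
        raise ValueError("omega is not defined at 0 in this sketch")
    count = 0
    p = 2
    while p * p <= n:
        if n % p == 0:
            count += 1
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        count += 1
    return count


def count_few_prime_divisors(terms, bound):
    """Number of indices n >= 1 with omega(W_n) < bound."""
    return sum(1 for w in terms[1:] if omega(w) < bound)
```

For example, `elliptic_divisibility_sequence(13)` starts with `0, 1, 1, -1, 1, 2, -1, -3, -5, 7, -4, -23, 29`, and one can check on it that $W_5$ divides $W_{10}$, as the divisibility property requires.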

Finally, it will be noticed that the same reasoning and similar results hold for elements of
non-degenerate divisibility sequences $(u_n)$ defined by linear recurrence relations of order 2, e.g.,
$u_n = a^n - 1$ where $a \geq 2$ is an integer. (The analogue of Silverman's theorem here is a result of
Schinzel, and the rest is easy).

11. Sieving for Frobenius over finite fields 

The final example of large sieve we discuss concerns the distribution of geometric Frobenius 
conjugacy classes in finite monodromy groups, refining the arguments and methods in [Ko1]. It
is a good example of a coset sieve as in Section El 

The precise setting is as follows (see also [Ko1]). Let $q$ be a power of a prime $p$, let $U/\mathbf{F}_q$
be a smooth affine geometrically connected algebraic variety of dimension $d \geq 1$ over $\mathbf{F}_q$. Put
$\bar{U} = U \times \bar{\mathbf{F}}_q$, the extension of scalars to an algebraic closure of $\mathbf{F}_q$.

Let $\bar{\eta}$ denote a geometric generic point of $U$. We consider the coset sieve with

\[ G = \pi_1(U, \bar{\eta}), \quad G^g = \pi_1(\bar{U}, \bar{\eta}), \quad G/G^g \simeq \mathrm{Gal}(\bar{\mathbf{F}}_q/\mathbf{F}_q) \simeq \hat{\mathbf{Z}}, \tag{11.1} \]

so that we have the exact sequence

\[ 1 \to G^g \to G \xrightarrow{\ d\ } \hat{\mathbf{Z}} \to 1, \]

the last arrow being the "degree".

We assume given a family of representations

\[ \rho_\ell : \pi_1(U, \bar{\eta}) \to GL(r, k_\ell) \]

for $\ell$ in a subset $\Lambda$ of the set of prime numbers, where $k_\ell$ is a finite field of characteristic $\ell$ and
$r$ is independent of $\ell$. By the equivalence of categories between lisse sheaves of $k$-modules and
continuous actions of $\pi_1(U, \bar{\eta})$ on finite dimensional $k$-vector spaces, this corresponds equivalently
to a system $(\mathcal{F}_\ell)$ of étale $k_\ell$-vector spaces. We then put $G_\ell = \mathrm{Im}(\rho_\ell)$, the arithmetic monodromy
group of $\rho_\ell$, so that we have surjective maps $G = \pi_1(U, \bar{\eta}) \to G_\ell$.

The siftable set we are interested in is given by $X = U(\mathbf{F}_q)$, with counting measure, with
the map $x \mapsto F_x \in G^\sharp$ given by the geometric Frobenius conjugacy class at the rational point
$x \in U(\mathbf{F}_q)$ (relative to the field $\mathbf{F}_q$). Since, in the exact sequence above, we have $d(F_x) = -1 \in \hat{\mathbf{Z}}$
for all X £ U{Fg), this corresponds to the sieve setting {Y,A, (pi)) where Y, as in Sectional is 
the set of conjugacy classes in 7ri(C/, f/) with degree —1. 

Then, concerning the exponential sums of Proposition 6.3, we have two basic bounds.

Proposition 11.1. Assume that the representations $(\rho_m)$ for $m \in S(\Lambda)$ are such that, for all
squarefree numbers $m$ divisible only by primes in $\Lambda$, the map

\[ \pi_1(\bar{U}, \bar{\eta}) \to G_m^g = \prod_{\ell \mid m} G_\ell^g \]

is onto. With notation as before and as in Proposition 6.3, we have:

(1) If $G_\ell$ is a group of order prime to $p$ for all $\ell \in \Lambda$, then

\[ W(\pi,\tau) = \delta((m,\pi),(n,\tau))\, q^d + O\bigl(q^{d-1/2} |G_{[m,n]}| (\dim\pi)(\dim\tau)\bigr) \]

for $m, n \in S(\Lambda)$, $\pi \in \Pi_m^*$, $\tau \in \Pi_n^*$, where the implied constant depends only on $\bar{U}$.

(2) If $d = 1$ ($U$ is a curve) and if the sheaves $\mathcal{F}_\ell$ are of the form $\mathcal{F}_\ell = \tilde{\mathcal{F}}_\ell / \ell \tilde{\mathcal{F}}_\ell$ for some
compatible family of torsion-free $\mathbf{Z}_\ell$-adic sheaves $\tilde{\mathcal{F}}_\ell$, then

\[ W(\pi,\tau) = \delta((m,\pi),(n,\tau))\, q + O\bigl(q^{1/2} (\dim\pi)(\dim\tau)\bigr) \]

where the implied constant depends only on the compactly-supported Euler-Poincaré characteristics
of $\bar{U}$ and of the compatible system $(\tilde{\mathcal{F}}_\ell)$ on $\bar{U}$.

Recall that a system $(\tilde{\mathcal{F}}_\ell)$ of étale sheaves of torsion-free $\mathbf{Z}_\ell$-modules is compatible if, for every
$\ell$, every extension field $\mathbf{F}_{q^r}$ of $\mathbf{F}_q$, any $t \in U(\mathbf{F}_{q^r})$, the characteristic polynomial $\det(1 - T F_t \mid \tilde{\mathcal{F}}_\ell)$
has integer coefficients and is independent of $\ell$. Then the Euler-Poincaré characteristic $\chi_c(\bar{U}, \tilde{\mathcal{F}}_\ell)$
is independent of $\ell$, being the degree of the L-function

\[ \prod_{x \in |U|} \det(1 - T^{\deg(x)} F_x \mid \tilde{\mathcal{F}}_\ell)^{-1} \]

of the sheaf as a rational function ($|U|$ is the set of all closed points of $U$).

Proof. This is essentially Proposition 5.1 of [Ko1], in the case $\ell = m$, $\ell' = n$ at least. We repeat
the proof since it is quite short.

By (6.5) and the definition of $X$, we have

\[ W(\pi,\tau) = \frac{1}{\sqrt{|\hat{\Gamma}_m^\pi| |\hat{\Gamma}_n^\tau|}} \sum_{u \in U(\mathbf{F}_q)} \mathrm{Tr}\bigl([\pi,\bar{\tau}]\rho_{[m,n]}(F_u)\bigr) \]

where the sum is the sum of local traces of Frobenius for a continuous representation of $\pi_1(U, \bar{\eta})$.
We can view $[\pi,\bar{\tau}]$ as a representation acting on a $\bar{\mathbf{Q}}_\ell$-vector space for some prime $\ell \neq p$, and
then this expression may be interpreted as the sum of local traces of Frobenius at points in
$U(\mathbf{F}_q)$ for some lisse $\bar{\mathbf{Q}}_\ell$-adic sheaf $\mathcal{W}(\pi,\tau)$ on $U$.

By the Grothendieck-Lefschetz Trace Formula (see, e.g., [Gr], [D2], [Mi, VI.13]), we have then

\[ W(\pi,\tau) = \frac{1}{\sqrt{|\hat{\Gamma}_m^\pi| |\hat{\Gamma}_n^\tau|}} \sum_{i=0}^{2d} (-1)^i\, \mathrm{Tr}\bigl(\mathrm{Fr} \mid H_c^i(\bar{U}, \mathcal{W}(\pi,\tau))\bigr) \]

where Fr denotes the global geometric Frobenius on $\bar{U}$.

Since the representation corresponding to $\mathcal{W}(\pi,\tau)$ factors through a finite group, this sheaf is
pointwise pure of weight 0. Therefore, by Deligne's Weil II Theorem [D1, p. 138], the eigenvalues
of the geometric Frobenius automorphism Fr acting on $H_c^i(\bar{U}, \mathcal{W}(\pi,\tau))$ are algebraic integers,
all conjugates of which are of absolute value $\leq q^{i/2}$.

This yields 

W{ti,t) ^ TV(Fr I //f (I7,W(7r,r))) +o((j;(C7,W(7r,r))/-i/2^ 

with an absolute implied constant, where

\[ \sigma_c'(\bar{U}, \mathcal{W}(\pi,\tau)) = \sum_{i=0}^{2d-1} \dim H_c^i(\bar{U}, \mathcal{W}(\pi,\tau)). \]

For the "main term", we use the formula

\[ H_c^{2d}(\bar{U}, \mathcal{W}(\pi,\tau)) = V_{\pi_1(\bar{U},\bar{\eta})}(-d) \]

where $V = \mathcal{W}(\pi,\tau)_{\bar{\eta}}$ is the space on which the representation which "is" the sheaf acts. But, by
assumption, when we factor the representation (restricted to the geometric fundamental group)
as follows

\[ \pi_1(\bar{U}, \bar{\eta}) \to G^g_{[m,n]} \to GL((\dim\pi)(\dim\tau), \bar{\mathbf{Q}}_\ell), \]

the first map is surjective. Hence we have

\[ V_{\pi_1(\bar{U},\bar{\eta})} = W_{G^g_{[m,n]}} \]

with $W$ denoting the space of $[\pi,\bar{\tau}]$. As we are dealing with linear representations of finite
groups in characteristic 0, this coinvariant space is the same as the space of invariants, and its
dimension is the multiplicity of the trivial representation in $[\pi,\bar{\tau}]$ (acting on $W$, i.e., restricted
to $G^g_{[m,n]}$). By Lemma 6.4 we have therefore

\[ H_c^{2d}(\bar{U}, \mathcal{W}(\pi,\tau)) = 0 \]

if $(m,\pi) \neq (n,\tau)$. Otherwise the dimension is $|\hat{\Gamma}_m^\pi|$ and the Tate twist means the global
Frobenius acts on the invariant space by multiplication by $q^d$ (the eigenvalue is exactly $q^d$, not
a root of unity times $q^d$, because the latter case would correspond to a situation where $[\pi,\bar{\tau}]$
vanishes identically on $Y_{[m,n]}$, which is excluded by the choice of $\Pi_m^*$, $\Pi_n^*$ in Proposition 6.3).
This gives

\[ W(\pi,\tau) = \delta((m,\pi),(n,\tau))\, q^d + O\bigl(\sigma_c'(\bar{U}, \mathcal{W}(\pi,\tau))\, q^{d-1/2}\bigr) \]

with an absolute implied constant.

To conclude, in Case (1), we appeal to Proposition 4.7 of [Ko1], which gives the desired
estimate directly. In Case (2), we will apply Proposition 4.1 of [Ko1], but we argue a
bit differently. Namely, we claim that

\[ \sigma_c'(\bar{U}, \mathcal{W}(\pi,\tau)) \leq (\dim[\pi,\bar{\tau}])(1 - \chi + w|S|), \tag{11.2} \]

where x = Xc{U,Qi), and w is the sum of Swan conductors of at the "points at infinity" 
X £ S C [/(Fq), which is independent of £, being equal to 

XrankWi -Xc{U,m) 

where both terms are independent of $\ell$. This provides the stated estimate (2).

(Footnote to the use of Proposition 4.1 of [Ko1]: the result there yields $\sigma_c'(\bar{U}, \mathcal{W}(\pi,\tau)) \leq (\dim[\pi,\bar{\tau}])(1 - \chi + \omega([m,n])w)$, and we do not want the term
$\omega([m,n])$, which would lead to a loss of $\log\log L$ below.)

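To check (11.2), we look at the proof of loc. cit. with the current notation, and extract the
bound

\[ \sigma_c'(\bar{U}, \mathcal{W}(\pi,\tau)) \leq (\dim[\pi,\bar{\tau}])\Bigl(1 - \chi + \sum_{x \in S} \max_{\ell \mid [m,n]} \mathrm{Swan}_x(\mathcal{W}_\ell)\Bigr). \]

Then, by positivity of the Swan conductors, we note that

\[ \mathrm{Swan}_x(\mathcal{W}_\ell) \leq \sum_{y \in S} \mathrm{Swan}_y(\mathcal{W}_\ell) = w \]

for each $x$ and $\ell \mid [m,n]$ (we use here the compatibility of the system), so that

\[ \max_{\ell \mid [m,n]} \mathrm{Swan}_x(\mathcal{W}_\ell) \leq w, \]

and

\[ \sum_{x \in S} \max_{\ell \mid [m,n]} \mathrm{Swan}_x(\mathcal{W}_\ell) \leq w|S| \]

which concludes the proof. □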

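To apply the bounds for the exponential sums to the estimation of the large sieve constants,
we need bounds for the quantities

\[ \max_{m,\pi} \Bigl\{ q^d + C (\dim\pi)\, q^{d-1/2} \sum_{n \in \mathcal{L}} \sum_{\tau \in \Pi_n^*} |G_{[m,n]}| (\dim\tau) \Bigr\} \tag{11.3} \]

in the first case and

\[ \max_{m,\pi} \Bigl\{ q + C (\dim\pi)\, q^{1/2} \sum_{n \in \mathcal{L}} \sum_{\tau \in \Pi_n^*} (\dim\tau) \Bigr\} \tag{11.4} \]

in the second case.

For this purpose, we make the following assumptions: for all $\ell \in \Lambda$, and $\pi \in \Pi_\ell^*$, we have

\[ |G_\ell| \leq (\ell+1)^s, \quad \dim\pi \leq (\ell+1)^v, \quad \sum_{\pi \in \Pi_\ell^*} (\dim\pi) \leq (\ell+1)^t, \tag{11.5} \]

where $s$, $t$ and $v$ are non-negative integers. In the notation of Section 7 the second and third
are implied by

\[ A_\infty(G_\ell) \leq (\ell+1)^v, \quad A_1(G_\ell) \leq (\ell+1)^t \]

respectively.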

Here are some examples; the first two are results proved in Section 7 (see Example 7.5).

- if $G_\ell$ is a subgroup of $GL(r, \mathbf{F}_\ell)$, we can take $s = r^2$, $v = r(r-1)/2$, $t = r(r+1)/2$.

- if $G_\ell$ is a subgroup of symplectic similitudes for some non-degenerate alternating form of
rank $2g$, we can take $s = g(2g+1)+1$, $v = (s - (g+1))/2 = g^2$, $t = g^2 + g + 1$.

- in particular, if $G_\ell \subset GL(2,\mathbf{F}_\ell)$ and $G_\ell^g = SL(2,\mathbf{F}_\ell)$, we have

\[ |G_\ell| \leq \ell^4, \quad \max(\dim\pi) = \ell + 1, \quad \sum_{\pi} (\dim\pi) \leq (\ell+1)^3. \tag{11.6} \]

This particular case can be checked easily by looking at the character table for GL{2, Fi) and 
SL{2,F£). See also the character tables of GL{3,F£) and GL(4, F^) in |^ for those cases. 

Remark 11.2. In [Ko1], we used different assumptions on the size of the monodromy groups and
the degrees of their representations. The crucial feature of (11.5) is that $A_1(G_\ell)$ and $A_\infty(G_\ell)$
are bounded by monic polynomials. Having polynomials with constant terms $> 1$ would mean,
after multiplicativity is applied, that $A_\infty(G_m)$ and $A_1(G_m)$ would be bounded by polynomials
times a divisor function; on average over $m$, this would mean a loss of a power of logarithm,
which in large sieve situations (as above with irreducibility of zeta functions of curves) is likely
to overwhelm the saving coming from using squarefree numbers in the sieve. In "small sieve"
settings, the loss from divisor functions is reduced to a power of $\log\log|X|$, which may remain
reasonable, and may be sufficient justification for using simpler but weaker polynomial bounds.

Let

\[ \psi(m) = \prod_{\ell \mid m} (\ell + 1). \tag{11.7} \]

It follows by multiplicativity from (11.5) that we have

\[ |G_m| \leq \psi(m)^s, \quad \dim\pi \leq \psi(m)^v, \quad \sum_{\pi \in \Pi_m^*} (\dim\pi) \leq \psi(m)^t, \tag{11.8} \]

for all squarefree $m$.

We wish to sieve with the prime sieve support $\mathcal{L}^* = \{\ell \in \Lambda \mid \ell \leq L\}$ for some $L$. The
first idea for the sieve is to use the traditional sieve support $\mathcal{L}_1$ which is the set of squarefree
integers $m \leq L$ divisible only by primes in $\Lambda$. However, since we have $\psi(m) \ll m \log\log m$, and
this upper bound is sharp (if $m$ has many small prime factors), the use of $\mathcal{L}_1$ leads to a loss of
a power of $\log\log L$ in the second term in the estimation of (11.3) and (11.4). As
described by Zywina [Z], this can be recovered using the trick of sieving using only squarefree
integers $m$ free of small prime factors, in the sense that $\psi(m) \leq L + 1$ instead of $m \leq L$ (which
for primes $\ell$ remains equivalent with $\ell \leq L$). This means we use the sieve support

\[ \mathcal{L} = \{m \in S(\Lambda) \mid m \text{ is squarefree and } \psi(m) \leq L + 1\}. \]
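To see concretely how the condition $\psi(m) \leq L + 1$ thins out the traditional support, here is a small Python sketch that enumerates both supports. It assumes, purely for illustration, that $\Lambda$ is the set of all primes, and the function names are hypothetical. Since $\psi(m) \geq m + 1$ for squarefree $m > 1$, every element of the new support is at most $L$, so it suffices to scan $m \leq L$.

```python
def distinct_prime_factors(m):
    """Distinct prime factors of m >= 1, by trial division."""
    factors = []
    p = 2
    while p * p <= m:
        if m % p == 0:
            factors.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        factors.append(m)
    return factors


def is_squarefree(m):
    """True if no prime square divides m."""
    return all(m % (p * p) != 0 for p in distinct_prime_factors(m))


def psi(m):
    """psi(m) = product of (l + 1) over primes l dividing m, as in (11.7)."""
    result = 1
    for p in distinct_prime_factors(m):
        result *= p + 1
    return result


def traditional_support(L):
    """Squarefree m <= L (Lambda assumed to be the set of all primes)."""
    return [m for m in range(1, L + 1) if is_squarefree(m)]


def restricted_support(L):
    """Squarefree m with psi(m) <= L + 1 (same assumption on Lambda)."""
    return [m for m in traditional_support(L) if psi(m) <= L + 1]
```

With $L = 10$, the traditional support is $\{1, 2, 3, 5, 6, 7, 10\}$ while the restricted one is $\{1, 2, 3, 5, 7\}$: the composite moduli $6$ and $10$, with $\psi(6) = 12$ and $\psi(10) = 18$, are discarded, whereas every prime $\ell \leq L$ is kept.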

We quote both types of sieves: 

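Corollary 11.3. With the above data and notation, let $\Omega_\ell \subset G_\ell$, for all primes $\ell \in \Lambda$, be a
conjugacy-invariant subset of $G_\ell$ such that $d(\Omega_\ell) = -1$. Then we have both

\[ |\{u \in U(\mathbf{F}_q) \mid \rho_\ell(F_u) \notin \Omega_\ell \text{ for } \ell \leq L\}| \leq \bigl(q^d + C q^{d-1/2} (L+1)^A\bigr) H^{-1} \tag{11.9} \]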
and 

(11.10) \{u G U{Fg) I pe{F^) i VL, for£^ L}\ ^ {q'' + Cq''-^'^ L^{\og\ogLy)K-\ 



where 



and 

(i) If $p \nmid |G_\ell|$ for all $\ell \in \Lambda$, we can take $A = v + 2s + t + 1$, and the constant $C$ depends only
on $\bar{U}$.

(ii) If $d = 1$ and the system $(\rho_\ell)$ arises by reduction of a compatible system of $\mathbf{Z}_\ell$-adic sheaves
on $U$, then we can take $A = t + v + 1$, and the constant $C$ depends only on the Euler-Poincaré
characteristic of $\bar{U}$, the compactly-supported Euler-Poincaré characteristic of the compatible
system $(\tilde{\mathcal{W}}_\ell)$ on $\bar{U}$, and on $s$, $t$, $v$ in the case of (11.10).

Proof. From Proposition I2.1U1 we must estimate 

\[ \Delta = \max_{m,\pi} \sum_{n} \sum_{\tau \in \Pi_n^*} |W(\pi,\tau)|, \]

where $m$ and $n$ run over $\mathcal{L}$ and $\mathcal{L}_1$, respectively. By Proposition 11.1 this is bounded by the
quantities (11.3) and (11.4). Using (11.8), the result is now straightforward, using (in the case
of (11.10)) the simple estimate

\[ \sum_{n \leq L}^{\flat} \psi(n)^A \ll L^{A+1} \]

for $L \geq 1$, $A \geq 0$, the implied constant depending on $A$. □

This theorem can be used to get a slight improvement of the "generic irreducibility" results
for numerators of zeta functions of curves of [Ko1] (see Section 6 of that paper for some context
and in particular Theorem 6.2): a small power of $\log q$ is gained in the upper bound, as in
Gallagher's result [G, Th. C]. We only state one special case, for a fixed genus (see the remark
following the statement for an explanation of this restriction).

Theorem 11.4. Let $\mathbf{F}_q$ be a finite field of characteristic $p \neq 2$, let $f \in \mathbf{F}_q[X]$ be a squarefree
monic polynomial of degree $2g$, $g \geq 1$. For $t \in \mathbf{F}_q$ which is not a zero of $f$, let $P_t \in \mathbf{Z}[T]$ be the
numerator of the zeta function of the smooth projective model of the hyperelliptic curve

\[ C_t : y^2 = f(x)(x - t), \tag{11.11} \]

and let $K_t$ be the splitting field of $P_t$ over $\mathbf{Q}$, which has degree $[K_t : \mathbf{Q}] \leq 2^g g!$. Then we have

\[ |\{t \in \mathbf{F}_q \mid f(t) \neq 0 \text{ and } [K_t : \mathbf{Q}] < 2^g g!\}| \ll q^{1-\gamma} (\log q)^{1-\delta} \]

where $\gamma = (4g^2 + 2g + 4)^{-1}$ and $\delta > 0$, with $\delta \sim 1/(4g)$ as $g \to +\infty$. The implied constant
depends only on $g$.

Proof. Let $S(f) \subset \mathbf{F}_q$ be the set in question. Proceeding as in Section 8 of [Ko1], we set up
a sieve using the sheaves $\mathcal{F}_\ell = R^1\pi_!\mathbf{F}_\ell$ for $\ell > 2$, $\pi$ denoting the projection from the family
of curves (11.11) to the parameter space $U = \{t \mid f(t) \neq 0\} \subset \mathbf{A}^1$. Those sheaves are tame,
obtained by reducing modulo $\ell$ a compatible system, and the geometric monodromy of $\mathcal{F}_\ell$ is
$Sp(2g, \mathbf{F}_\ell)$ by a result of J.K. Yu (which also follows from a recent more general result of C.
Hall; see [Ko3] for a write-up of this special case of Hall's result). Using (11.3), and the
proof of Proposition 11.1 to bound explicitly the implied constant, we obtain

\S{f)\^{q + Ag^L^)H-^ 

where A = 2g'^ + g + 2 (see Example 17. 5|) and 



the sets being defined as in |KoH §7,8]. For each of these we have 

l\ Sj ^ / 1 

for i 3, for some di G]0, 1[ which is a "density" of conjugacy classes satisfying certain con- 
ditions, either in the group of permutations on g letters or the group of signed permutations 
of 2g letters (this follows easily from IG, §2] and Sections 7, 8 of |Kolj ). The implied constant 
depends only on g. Precisely, we have 

A 1 A 1 A , 1 



25' logfif' ^/2^Tg 

as g ^ +00 (see |Kol[ §8]). 

Thus, we need lower bounds for sums of the type 

(3{m)= /?M^'M 

where /3 is a multiplicative function, roughly constant at the primes. This is a well-studied area 
of analytic number theory. We can appeal for instance to Theorem 1 of (LWj : in the notation 
of loc. cit., we have /(m) = //^(m)/?(m), g{m) = ipi^rn), with k = 5i/{l — Si), r/ = 1, a = 1, 
e = 1, a' = 1, 9' = 0, t{p) = 0, C3 = 0. We obtain 

(11.13) Y^ /3(m) >L(logL)-i+'^>/(i-^»), 

ip(m)^L 

for L ^ 3, where the implied constant depends only on g. 
Taking L = g^/^A^ 

upper bound for S{f) then follows. □ 




Remark 11.5. In [Ko1], we obtained a result uniform in terms of $g$. Here it is certainly possible
to do the same, by checking the dependency of the estimate (11.13) on $g$. However, notice that
the gain compared to [Ko1] is of size $(\log q)^{\delta}$ with $\delta \sim 1/(4g)$, and this becomes trivial as soon
as $g$ is of size $\log\log q$. This is a much smaller range than the (already restricted) range where
the estimate of [Ko1] is non-trivial, namely $g$ somewhat smaller than $\sqrt{\log q}$.

Now we prove Theorem 1.5, stated in the introduction.

Proof of Theorem 1.5. We can certainly afford to be rather brief here. The sieve setting and
siftable set are the same as in Theorem 11.4. The number of points of $C_t$ and $J_t$ are given by

\[ |C_t(\mathbf{F}_q)| = q + 1 - \mathrm{Tr}(\mathrm{Fr} \mid H^1(\bar{C}_t, \mathbf{Z}_\ell)), \quad |J_t(\mathbf{F}_q)| = |\det(1 - \mathrm{Fr} \mid H^1(\bar{C}_t, \mathbf{Z}_\ell))| \]

(for any prime $\ell \neq p$). Thus, defining sieving sets

ni = {geCSp{2g,¥,) \ g is g-symplectic and det(5' - 1) is a square in F^}, 

= {geCSp{2g,¥,) \ g is Q'-symplectic and q + l- Tr(5r) is a square in F^}, 

(where g-symplectic similitudes are those with multiplicator (7), we have for any prime sieve 
support C* the inclusion 

{t E Fq I f{t) / and \St{Yq)\ is a square} C S{U,^l^]C), 

for S G {C, J}. By (3) and (4), respectively, of Proposition IB . II in Appendix B, we have 

l^^^l ^ 1 / I \9 



\Sp{2g,Fe)\ ^ 2\e+l. 
for i ^ 3. Thus if £ is the set of odd prime ^ L, we obtain 

\{teFg I fit) / and \St{Fg)\ is a square}] ^ {q + Ag^L^)H-^ 
where A = 2g'^ + g -\-2, and 



\Spi2g,¥A\ 2 ^ V^ + 1 



By the mean-value theorem we have 



^ l-T^ + 0{g\£ + l)-') 



•+1J i + l 

for £ ^ 3, g ^ 1, with an absolute implied constant, and thus by the Prime Number Theorem 
we have 

H^^7r{L) + 0{gloglogL + g^) 
with an absolute implied constant. For L ^ g^log2g, this gives 



21ogL 

with an absolute implied constant, and 

\{t e Fg I f{t) / and \St{Fg)\ is a square}] < g^{q + q^/^L^)L-\logL), 

this time with no condition on g and L as this is trivial when L ^ Cg'^ log 2g (which explains the 
poorer dependency on g than follows from what we said). So choosing L = we obtain 

the uniform estimate 

\{t£Fg I f{t) ^ and \St(Fg)\ is a square}] < g^-'^(logg) 

with 7 = l/(4g^ + 2g + 4) and where the implied constant is absolute. □ 
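The quantity being bounded here can be explored numerically for small prime fields. The Python sketch below is only an illustration under stated assumptions: it works over a prime field $\mathbf{F}_p$ with $p$ odd, takes $f$ as a list of coefficients (lowest degree first), assumes that $f$ is squarefree modulo $p$, and uses the fact that the smooth projective model of $y^2 = f(x)(x-t)$, of odd degree $2g+1$ in $x$, has exactly one point at infinity. The function names are hypothetical.

```python
import math


def legendre_symbol(a, p):
    """Legendre symbol (a / p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def poly_eval(coeffs, x, p):
    """Evaluate a polynomial (coefficients lowest degree first) modulo p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


def curve_point_count(f, t, p):
    """Number of F_p-points of the smooth projective model of y^2 = f(x)(x - t).

    Assumes f squarefree mod p and f(t) != 0 mod p; the single point at
    infinity accounts for the initial 1.
    """
    total = 1
    for x in range(p):
        total += 1 + legendre_symbol(poly_eval(f, x, p) * (x - t), p)
    return total


def is_perfect_square(n):
    """True if the non-negative integer n is a perfect square."""
    return n >= 0 and math.isqrt(n) ** 2 == n


def count_square_point_counts(f, p):
    """Number of t in F_p with f(t) != 0 and |C_t(F_p)| a perfect square."""
    return sum(
        1
        for t in range(p)
        if poly_eval(f, t, p) != 0 and is_perfect_square(curve_point_count(f, t, p))
    )
```

For instance, with $p = 7$ and $f = X^2 + 1$ (irreducible modulo 7, so $g = 1$), each $C_t$ is an elliptic curve and `curve_point_count` agrees with a direct enumeration of solutions; `count_square_point_counts` then gives the size of the set that the theorem bounds, in this tiny case.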

See also the end of Appendix A for a lower bound sieve result on the same families of curves. 



It seems that the exponent of $q$ is better, but this reflects the use of the "right" bounds for degrees of
representations of finite symplectic groups, and this exponent can be obtained with the method of [Ko1] also.

Appendix A: small sieves

If we are in a general sieving situation as described in Section 2, we may in many cases be
interested in a lower bound, in addition to the upper bounds that the large sieve naturally
provides. For this purpose we can hope to appeal to the usual principles of small sieves, at least
when $\Lambda$ is the set of prime numbers and for some specific sieve supports. We describe this for
completeness, with no claim to originality, and refer to books such as [HR], the forthcoming [IF]
or [IK, §7] for more detailed coverage of the principles of sieve theory.

We assume that our sieve setting is of the type

\[ \Psi = (G, \{\text{primes}\}, (\rho_\ell)), \]

and our sieve support will be the set $\mathcal{L}$ of squarefree numbers $d < L$ for some parameter $L$. We
write $S(X; \Omega, L)$ for the sifted set $S(X; \Omega, \mathcal{L})$. The siftable set is $(X, \mu, F)$ as before.
Let 

l\ra 

for m squarefree, and for an arbitrary integrable function x i— > denote 



S'rf(A;a)= / a{x)dfi{x). 
'{Pd(i^^)end} 



For X € X, let n{x) ^ 1 be the integer defined by 

n{x) = n ^ 

SO that for squarefree d £ C, we have Pd[Fx) G if and only if d \ n{x) 
Then we have 

/ a{x)dp{x) = I a{x)dp{x) 

Js(X;n,L) J{{n{x),P{L))=l} 



( / a{x)dp{x)^ = an 

(n,P{L))=l •^{"(^)="} ^n,P{L))=l 



where P{L) is the product of primes i < L and 

ttn = a{x)dp{x). 

J {n{x)=n} 

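Note that

\[ \sum_{n \equiv 0 \pmod d} a_n = S_d(X;\alpha). \]

Let now $(\lambda_d^{\pm})$ be two sequences of real numbers supported on $\mathcal{L}$ such that $\lambda_1^{\pm} = 1$ and

\[ \sum_{d \mid n} \lambda_d^{+} \ge 0, \qquad \sum_{d \mid n} \lambda_d^{-} \le 0, \]

for $n \ge 2$. Then, if $\alpha(x) \ge 0$ for all $x$, we have

\[ \sum_{(n,P(L))=1} a_n \le \sum_n \Bigl(\sum_{d \mid (n,P(L))} \lambda_d^{+}\Bigr) a_n = \sum_{d<L} \lambda_d^{+} \Bigl(\sum_{n \equiv 0 \pmod d} a_n\Bigr) = \sum_{d<L} \lambda_d^{+} S_d(X;\alpha) \]

and similarly

\[ \sum_{(n,P(L))=1} a_n \ge \sum_{d<L} \lambda_d^{-} S_d(X;\alpha). \]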

It is natural to introduce the approximations (compare (2.9))

\[ S_d(X;\alpha) = \nu_d(\Omega_d) H + r_d(X;\alpha), \tag{A.1} \]

(where $\nu_d$ is the density as in Section 2), which is really a definition of $r_d(X;\alpha)$, where the
"expected main term" is

\[ H = \int_X \alpha(x)\,d\mu(x). \]



Then, in effect, we have proved: 



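Proposition A.1. Assume $\alpha(x) \ge 0$ for all $x \in X$. Let $(\lambda_d^{\pm})$ be arbitrary upper and lower-bound
sieve coefficients which vanish for $d \ge L$. We have then

\[ V^{-}(\Omega) H - R^{-}(X;L) \le \int_{S(X;\Omega,L)} \alpha(x)\,d\mu(x) \le V^{+}(\Omega) H + R^{+}(X;L) \]

where

\[ V^{\pm}(\Omega) = \sum_{d<L} \lambda_d^{\pm} \nu_d(\Omega_d) \quad\text{and}\quad R^{\pm}(X;L) = \sum_{d<L} |\lambda_d^{\pm} r_d(X;\alpha)|. \]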
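As a sanity check of the two-sided bound in Proposition A.1, here is a minimal sketch in the most classical situation: $X = \{1,\ldots,N\}$ with counting measure, $\alpha = 1$, reduction modulo primes, and $\Omega_\ell = \{0\}$, so that $\nu_d(\Omega_d) = 1/d$, $S_d = \lfloor N/d \rfloor$ and $H = N$. The weights are again truncated Möbius weights, which is an assumption for illustration (they are not truncated at $d < L$ as in the text); the values of `N` and `primes` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import gcd, prod

N = 1000                       # X = {1, ..., N}, counting measure, alpha = 1
primes = [2, 3, 5, 7]          # hypothetical sieving primes
P = prod(primes)

def brun_weights(max_factors):
    """lambda_d = mu(d) for d | P with at most `max_factors` prime factors."""
    weights = {}
    for r in range(max_factors + 1):
        for combo in combinations(primes, r):
            weights[prod(combo)] = (-1) ** r
    return weights

def main_and_error(weights):
    """Return (V, R): V = sum lambda_d / d, R = sum |lambda_d * r_d|,
    where r_d = floor(N/d) - N/d is the remainder in (A.1)."""
    V = sum(Fraction(lam, d) for d, lam in weights.items())
    R = sum(abs(lam * (N // d - Fraction(N, d))) for d, lam in weights.items())
    return V, R

# Sifted set: integers up to N with no prime factor among `primes`.
sifted = sum(1 for x in range(1, N + 1) if gcd(x, P) == 1)

V_minus, R_minus = main_and_error(brun_weights(3))   # odd truncation
V_plus, R_plus = main_and_error(brun_weights(2))     # even truncation
lower_bound = V_minus * N - R_minus
upper_bound = V_plus * N + R_plus
print(float(lower_bound), sifted, float(upper_bound))
```

Exact rational arithmetic is used so that the comparison does not depend on floating-point rounding.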



In fact this is not quite what is needed for applications, because $V^{\pm}(\Omega)$ are not yet in a
form that makes them easy to evaluate. This next crucial step (called a "fundamental lemma")
depends on the choice of $(\lambda_d^{\pm})$ (which is by no means obvious) and on properties of $\Omega_d$. For
instance, we have the following (see e.g. [IK, Cor. 6.2]; note that this is by no means the most general
or best result known).

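Proposition A.2. Let $\kappa > 0$ and $y > 1$. There exist upper and lower-bound sieve coefficients
$(\lambda_d^{\pm})$, depending only on $\kappa$ and $y$, supported on squarefree integers $< y$, bounded by one in
absolute value, with the following properties: for all $s \ge 9\kappa + 1$ and $L^{9\kappa+1} < y$, we have

\[ \int_{S(X;\Omega,L)} \alpha(x)\,d\mu(x) < \bigl(1 + e^{9\kappa+1-s}K^{10}\bigr) \prod_{\ell<L} (1 - \nu_\ell(\Omega_\ell))\, H + R^{+}(X;L^s), \]

\[ \int_{S(X;\Omega,L)} \alpha(x)\,d\mu(x) > \bigl(1 - e^{9\kappa+1-s}K^{10}\bigr) \prod_{\ell<L} (1 - \nu_\ell(\Omega_\ell))\, H - R^{-}(X;L^s), \]

provided the sieving sets $(\Omega_\ell)$ satisfy the condition

\[ \prod_{w \le \ell < L} (1 - \nu_\ell(\Omega_\ell))^{-1} \le K \Bigl(\frac{\log L}{\log w}\Bigr)^{\kappa}, \quad \text{for all } w \text{ and } L,\ 2 \le w < L < y, \tag{A.2} \]

for some $K \ge 0$.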

In standard applications, $r_d(X;\alpha)$ should be "small", as the remainder term in some equidistribution
theorem. Note again that this can only be true if the family $(\rho_\ell)$ is linearly disjoint.
If this remainder is well-controlled on average over $d < D$, for some $D$ (as large as possible),
we can apply the above for $L$ such that $L^s < D$ (with $s \ge 9\kappa + 1$). Note that when $s$ is large
enough (i.e., $L$ small enough), the coefficient $1 \pm e^{9\kappa+1-s}K^{10}$ will be close to $1$; in particular it
will be positive in the lower bound.

Note that the condition (A.2) holds if $\nu_\ell(\Omega_\ell)$ is of size $\kappa\ell^{-1}$ on average. This is the traditional
context of a "small sieve" of dimension $\kappa$; we see that in the abstract framework, this means
rather that the sieving sets are "of codimension 1" in a certain sense. The important case
$\kappa = 1$ (the classical "linear sieve") corresponds intuitively to sieving sets defined by a single
irreducible algebraic condition.
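To get a feeling for condition (A.2) in the linear case, one can compute the smallest admissible $K$ on a finite range for the model density $\nu_\ell(\Omega_\ell) = 1/\ell$ and $\kappa = 1$. This is only an empirical sketch: the density, the cutoff `limit`, and the restriction to integer values of $w$ and $L$ are assumptions made for illustration.

```python
import math

def primes_below(n):
    """Primes p < n by the sieve of Eratosthenes (n >= 2)."""
    sieve = [True] * n
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, n, p):
                sieve[m] = False
    return [p for p in range(n) if sieve[p]]

def empirical_K(limit):
    """Largest value of prod_{w <= l < L} (1 - 1/l)^(-1) / (log L / log w)
    over integers 2 <= w < L <= limit, i.e. the least K for which (A.2)
    holds with kappa = 1 on this range."""
    prime_set = set(primes_below(limit))
    # prefix[n] = sum of -log(1 - 1/l) over primes l < n
    prefix = [0.0] * (limit + 1)
    for n in range(1, limit + 1):
        ell = n - 1
        step = -math.log(1 - 1 / ell) if ell in prime_set else 0.0
        prefix[n] = prefix[n - 1] + step
    best = 0.0
    for w in range(2, limit):
        for L in range(w + 1, limit + 1):
            product = math.exp(prefix[L] - prefix[w])
            best = max(best, product / (math.log(L) / math.log(w)))
    return best

K = empirical_K(300)
print(K)
```

By Mertens's theorem the ratio stays bounded as the range grows, which is the content of (A.2) for a sieve of dimension 1.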

Note also that the factor

\[ \prod_{\ell<L} (1 - \nu_\ell(\Omega_\ell)) \]

is the natural one to expect intuitively if $\nu_\ell(\Omega_\ell)$ is interpreted as the probability of $\rho_\ell(F_x)$ being
in $\Omega_\ell$, and the various $\ell$ being independent. To see the connection with the quantity $H$ in
the large sieve bound (2.4), note that if $\mathcal{L}$ is the full power set of the prime sieve support $\mathcal{L}^*$,
then multiplicativity gives

\[ H = \prod_{\ell \in \mathcal{L}^*} \Bigl(1 + \frac{\nu_\ell(\Omega_\ell)}{\nu_\ell(Y_\ell - \Omega_\ell)}\Bigr) = \prod_{\ell \in \mathcal{L}^*} \frac{1}{1 - \nu_\ell(\Omega_\ell)} \]

(recall $\nu_\ell(Y_\ell) = 1$). So $H^{-1}$ has exactly the same shape as the factor above. Of course, as in
small sieves, if $\mathcal{L}$ is as large as the power set of $\mathcal{L}^*$, the large sieve constant will be much too
big for the large sieve inequality to be useful, and so "truncation" is needed.
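The multiplicativity step can be verified exactly on a toy example. The sketch below assumes the model densities $\nu_\ell(\Omega_\ell) = 1/\ell$ on a few hypothetical primes, expands the large sieve quantity as a sum over all squarefree products of those primes, and compares it with the Euler product.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

primes = [2, 3, 5, 7]                                  # hypothetical prime support
nu = {ell: Fraction(1, ell) for ell in primes}         # model densities nu_l(Omega_l)

def H_as_sum():
    """Sum over all squarefree m built from `primes` of
    prod_{l | m} nu_l / (1 - nu_l)."""
    total = Fraction(0)
    for r in range(len(primes) + 1):
        for combo in combinations(primes, r):
            total += prod((nu[l] / (1 - nu[l]) for l in combo),
                          start=Fraction(1))
    return total

def H_as_product():
    """Euler product prod_l 1 / (1 - nu_l)."""
    return prod((1 / (1 - nu[l]) for l in primes), start=Fraction(1))

print(H_as_sum(), H_as_product())
```

With these densities both expressions equal $\prod \ell/(\ell-1) = 35/8$.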
We conclude with a simple application, related to Theorem 1.5 and Section 11.

Proposition A.3. Let $q$ be a power of a prime number $p \ge 5$, $g \ge 1$ an integer and let $f \in \mathbf{F}_q[T]$
be a squarefree polynomial of degree $2g$. For $t$ not a zero of $f$, let $C_t$ denote the smooth projective
model of the hyperelliptic curve $y^2 = f(x)(x - t)$, and let $J_t$ denote its Jacobian variety. There
exists an absolute constant $\alpha \ge 0$ such that

\[ |\{t \in \mathbf{F}_q \mid f(t) \neq 0 \text{ and } |C_t(\mathbf{F}_q)| \text{ has no odd prime factor} < q^{\gamma}\}| \gg \frac{q}{\log q}, \]

\[ |\{t \in \mathbf{F}_q \mid f(t) \neq 0 \text{ and } |J_t(\mathbf{F}_q)| \text{ has no odd prime factor} < q^{\gamma}\}| \gg \frac{q}{\log q}, \]

for any $\gamma$ such that

\[ \gamma^{-1} > \alpha(2g^2 + g + 1)(\log\log 3g), \]

where the implied constants depend only on $g$ and $\gamma$.

In particular, for any fixed $g$, there are infinitely many points $t \in \bar{\mathbf{F}}_q$ such that $|C_t(\mathbf{F}_{q^{\deg(t)}})|$
has at most $\alpha(2g^2 + g + 1)(\log\log 3g) + 2$ prime factors, and similarly for $|J_t(\mathbf{F}_{q^{\deg(t)}})|$.

Remark A.4. (1) It may well be that $|J_t(\mathbf{F}_q)|$ is even for all $t$, since if $f$ has a root $x_0$ in $\mathbf{F}_q$, it
will define a non-zero point of order 2 in $J_t(\mathbf{F}_q)$.

(2) See e.g. [Co] for results on almost prime values of group orders of elliptic curves over Q
modulo primes; except for CM curves, they are conditional on GRH.

Proof. Obviously we use the same coset sieve setting and siftable set as in Theorems 11.4 and
1.5, and consider the sieving sets

\[ \Omega_\ell^{J} = \{g \in CSp(2g,\mathbf{F}_\ell) \mid g \text{ is } q\text{-symplectic and } \det(g - 1) = 0 \in \mathbf{F}_\ell\}, \]

\[ \Omega_\ell^{C} = \{g \in CSp(2g,\mathbf{F}_\ell) \mid g \text{ is } q\text{-symplectic and } \mathrm{Tr}(g) = q + 1\}, \]

for $\ell \ge 3$, where $S \in \{C, J\}$. By (5) and (6) of Proposition B.1 we have

from which (A.2) can be checked to hold with $\kappa = 1$ and $K \ll \log g$ (consider separately primes
$\ell < g$ and $\ell \ge g$).

Coming to the error term $R^{-}(X;L^s)$, individual estimates for $r_d(X;\alpha)$ with $\alpha(x) = 1$ amount
to estimates for the error term in the Chebotarev density theorem. Using Proposition 11.1 in
the standard way we obtain

rd{X;a) « gq'/'\nl\'/' « gq'^^ [tP{df''^'d3''v{dr'y^\ 
with absolute implied constants (see also [Ko2, Th. 1.3]), and hence

R-{X-L') < gq^/'^L'^'^9^+9+^^/^{\og\ogLy^+9, 

for any $s \ge 1$, with an absolute implied constant.

Let $s = \log 2 + 10\log K \ll \log\log 3g$, and let $\varepsilon > 0$ be arbitrarily small. Then we can take

\[ L = q^{(s(2g^2+g+1))^{-1} - \varepsilon} \]

in the lower bound sieve, which gives

$|\{t \in \mathbf{F}_q \mid f(t) \neq 0 \text{ and } |S_t(\mathbf{F}_q)| \text{ has no odd prime factor} < L\}|$



»g n (1 - i^K^^f )) » 9 n 



£<L 3g<i<L ^ ' 

i/^{nS)<i 

provided $L > 3g$, say, the implied constant being absolute. Putting everything together, the
result now follows easily. □

Appendix B: local density computations over finite fields 

In earlier sections and in the previous Appendix, we have quoted various estimates for the
"density" of certain subsets of matrix groups over finite fields, which are required to prove lower
(or upper) bounds for the saving factor $H$ in certain applications of the large sieve inequalities.
We prove those statements here, relying mostly on the work of Chavdarov [Ch] to link such
densities with those of polynomials of certain types, which are much easier to compute. In one
case, however, we use the Riemann Hypothesis over finite fields to estimate a multiplicative
exponential sum.

Proposition B.1. Let $\ell \ge 3$ be a prime number.

(1) Let $G = SL(n,\mathbf{F}_\ell)$ or $G = Sp(2g,\mathbf{F}_\ell)$, with $n \ge 2$ or $g \ge 1$. Then we have

\[ \frac{1}{|G|}\, |\{g \in G \mid \det(g - T) \in \mathbf{F}_\ell[T] \text{ is irreducible}\}| \gg 1 \]

where the implied constant depends only on $n$ or $g$.

(2) Let $G = SL(n,\mathbf{F}_\ell)$ or $G = Sp(2g,\mathbf{F}_\ell)$, with $n \ge 2$ or $g \ge 1$, let $i$, $j$ be integers with
$1 \le i, j \le n$ or $1 \le i, j \le 2g$ respectively. Then we have

\[ \frac{1}{|G|}\, |\{g = (g_{\alpha,\beta}) \in G \mid g_{i,j} \in \mathbf{F}_\ell \text{ is not a square}\}| \gg 1 \]

where the implied constant depends only on $n$ or $g$.

(3) Let $G = CSp(2g,\mathbf{F}_\ell)$ with $g \ge 1$, and denote by $m(g) \in \mathbf{F}_\ell^{\times}$ the multiplicator of a
symplectic similitude $g \in G$. Then for any $q \in \mathbf{F}_\ell^{\times}$, we have

1 1 f £ \9 

\Sp{2g Fi] ^^^ ^ ^ ' "^^^^ ^ ^ "'^'^ ^^^^^ -1) is a square in Fe}\ ^ 2 VTTT/ ' 

(4) Let $G = CSp(2g,\mathbf{F}_\ell)$ with $g \ge 1$. Then for any $q \in \mathbf{F}_\ell^{\times}$, we have
G G I m{g) = q and q + 1 — TT{g) is a square in F^H ^ - 



(5) Let $G = CSp(2g,\mathbf{F}_\ell)$ with $g \ge 1$. Then for any $q \in \mathbf{F}_\ell^{\times}$, we have

1 / 
-\{g G G I m{g) = q and det{g — 1) = 0}| ^ mini 1, 



(6) Let $G = CSp(2g,\mathbf{F}_\ell)$ with $g \ge 1$. Then for any $q \in \mathbf{F}_\ell^{\times}$, we have

1 / 
-|{(7 G G I m{g) = q and q + 1 — Ti{g) = 0}| ^ mini 1, 



\Sp{2g,Fey'^ . v^.. . . ' - V ' - 1) 

Proof. (1) (Compare with [Ch, §3], [Ko1, Lemma 7.2]). Take the case $G = SL(n,\mathbf{F}_\ell)$, for
instance. We need to compute

\[ \frac{1}{|G|} \sum_{f} |\{g \in G \mid \det(g - T) = (-1)^n f\}|, \]

where $f$ runs over the set of irreducible monic polynomials $f \in \mathbf{F}_\ell[T]$ of degree $n$ with
$f(0) = 1$. For each $f$, we have

\[ |\{g \in G \mid \det(g - T) = f\}| \gg \frac{|G|}{\ell^{n-1}} \]

by the argument in [Ch, Th. 3.5] (note that the algebraic group $SL(n)$ is connected and simply
connected), and the number of $f$ is close to $\ell^{n-1}/n$ as $\ell \to +\infty$, by identifying the set of $f$ with
the set of Galois-orbits of elements of norm 1 in $\mathbf{F}_{\ell^n}$ which are of degree $n$ and not smaller. All
this implies the result for $G$, the case of the symplectic group being similar.
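The count of irreducible polynomials used in (1) can be checked by brute force for small parameters. The sketch below is restricted to degrees 2 and 3, where having no root in $\mathbf{F}_\ell$ is equivalent to irreducibility; the function names and the choice $\ell = 7$ are hypothetical, and this is not the method of [Ch].

```python
from itertools import product

def has_root(coeffs, ell):
    """coeffs = [c_0, ..., c_n] for c_0 + c_1 T + ... + c_n T^n over F_ell."""
    return any(
        sum(c * pow(x, k, ell) for k, c in enumerate(coeffs)) % ell == 0
        for x in range(ell)
    )

def count_irreducible(ell, n):
    """Number of monic irreducible f in F_ell[T] of degree n with f(0) = 1.
    Only n in {2, 3} is supported: there, irreducible <=> no root in F_ell."""
    assert n in (2, 3)
    count = 0
    for middle in product(range(ell), repeat=n - 1):
        coeffs = [1, *middle, 1]          # constant term 1, leading term 1
        if not has_root(coeffs, ell):
            count += 1
    return count

for n in (2, 3):
    print(n, count_irreducible(7, n), 7 ** (n - 1) / n)
```

For $\ell = 7$ the counts are 3 and 18, to be compared with $\ell^{n-1}/n$, which is $3.5$ and about $16.3$.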
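(2) By detecting squares using the Legendre character, we need to compute

\[ \frac{1}{2|G|} \sum_{\substack{g \in G \\ g_{i,j} \neq 0}} \Bigl(1 - \Bigl(\frac{g_{i,j}}{\ell}\Bigr)\Bigr), \]

where $\bigl(\frac{\cdot}{\ell}\bigr)$ is the non-trivial quadratic character of $\mathbf{F}_\ell^{\times}$. Let $\mathbf{G}$ be the algebraic group $SL(n)$ or
$Sp(2g)$ over $\mathbf{F}_\ell$, $d$ its dimension (either $n^2 - 1$ or $2g^2 + g$). Since $\mathbf{G} \cap \{g_{i,j} = 0\}$ is obviously a
proper closed subset of the geometrically connected affine variety $\mathbf{G}$, the affine variety

\[ \mathbf{G}_{i,j} = \mathbf{G} - \mathbf{G} \cap \{g_{i,j} = 0\} \]

over $\mathbf{F}_\ell$ is geometrically connected of dimension $d$, and we have

\[ |\mathbf{G}_{i,j}(\mathbf{F}_\ell)| = |\{g \in \mathbf{G}(\mathbf{F}_\ell) \mid g_{i,j} \neq 0\}| \gg |\mathbf{G}(\mathbf{F}_\ell)|, \]

for $\ell \ge 3$. This means that it is enough to prove

\[ \sum_{g \in \mathbf{G}_{i,j}(\mathbf{F}_\ell)} \Bigl(\frac{g_{i,j}}{\ell}\Bigr) \ll \ell^{d-1/2} \]

for $\ell \ge 3$, the implied constant depending only on $\mathbf{G}$. Such a bound follows (for instance)
from the fact that this sum is a multiplicative character sum over the $\mathbf{F}_\ell$-rational points of the
geometrically connected affine algebraic variety $\mathbf{G}_{i,j}$ of dimension $d$.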

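Instead of looking for an elementary proof (which may well exist), we invoke the powerful
$\ell$-adic cohomological formalism (see e.g. [IK, 11.11] for an introduction, and compare with the
proof of Proposition 11.1). Using the (rank 1) Lang-Kummer sheaf $\mathcal{K} = \mathcal{L}_{(\frac{g_{i,j}}{\ell})}$, we have by the
Grothendieck-Lefschetz trace formula

\[ \sum_{g \in \mathbf{G}_{i,j}(\mathbf{F}_\ell)} \Bigl(\frac{g_{i,j}}{\ell}\Bigr) = \sum_{g \in \mathbf{G}_{i,j}(\mathbf{F}_\ell)} \mathrm{Tr}(\mathrm{Fr}_{g,\ell} \mid \mathcal{K}) = \sum_{k=0}^{2d} (-1)^k\, \mathrm{Tr}(\mathrm{Fr} \mid H^k_c(\mathbf{G}_{i,j}, \mathcal{K})) \]

where $\mathrm{Fr}_{g,\ell}$ (resp. $\mathrm{Fr}$) is the local (resp. global) geometric Frobenius for $g$ seen as defined over
$\mathbf{F}_\ell$ (resp. acting on the cohomology of the base-changed variety to an algebraic closure of $\mathbf{F}_\ell$).
By Deligne's Riemann Hypothesis (see, e.g., [IK, Th. 11.37]), we have

\[ \Bigl|\sum_{g \in \mathbf{G}_{i,j}(\mathbf{F}_\ell)} \Bigl(\frac{g_{i,j}}{\ell}\Bigr)\Bigr| \le \ell^{d} \dim H^{2d}_c(\mathbf{G}_{i,j}, \mathcal{K}) + \ell^{d-1/2} \sum_{k<2d} \dim H^{k}_c(\mathbf{G}_{i,j}, \mathcal{K}) \ll \ell^{d} \dim H^{2d}_c(\mathbf{G}_{i,j}, \mathcal{K}) + \ell^{d-1/2} \]

for $\ell \ge 3$, by results of Bombieri or Adolphson-Sperber that show that the sum of dimensions
of cohomology groups is bounded independently of $\ell$ (see, e.g., [IK, Th. 11.39]).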

It therefore remains to prove that $H^{2d}_c(\mathbf{G}_{i,j}, \mathcal{K}) = 0$. However, this space is isomorphic (as
vector space) to the space of coinvariants of the geometric fundamental group of $\mathbf{G}_{i,j}$ acting on a
one-dimensional space through the character which "is" the Lang-Kummer sheaf $\mathcal{K}$. This means
that either the coinvariant space is zero, and we are done, or otherwise the sheaf is geometrically
trivial. The latter translates to the fact that the traces on $\mathcal{K}$ of the local Frobenius $\mathrm{Fr}_{g,\ell^{\nu}}$ of
rational points $g \in \mathbf{G}_{i,j}(\mathbf{F}_{\ell^{\nu}})$ over all extension fields $\mathbf{F}_{\ell^{\nu}}/\mathbf{F}_\ell$ depend only on $\nu$, i.e., the map

\[ g \mapsto \Bigl(\frac{N_{\mathbf{F}_{\ell^{\nu}}/\mathbf{F}_\ell}(g_{i,j})}{\ell}\Bigr) \]

on $\mathbf{G}_{i,j}(\mathbf{F}_{\ell^{\nu}})$ depends only on $\nu$. But this is clearly impossible for $SL(n)$ or $Sp(2g)$ with $n \ge 2$,
$g \ge 1$ (but not for $SL(1)$ or for $SL(2,\mathbf{F}_2)$...), because we can explicitly write down matrices even
in $\mathbf{G}(\mathbf{F}_\ell)$ both with $g_{i,j}$ a non-zero square and $g_{i,j}$ not a square (taking $\ell \ge 3$ for $SL(2,\mathbf{F}_\ell)$).
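For the smallest case the statement of (2) can be confirmed by direct enumeration. The following sketch lists $SL(2,\mathbf{F}_\ell)$ by brute force, counts the matrices whose chosen entry is a non-square, and evaluates the quadratic character sum over the matrices with non-zero entry. It is an elementary check under the assumption $n = 2$ and small $\ell$, not a substitute for the cohomological argument; the function name is hypothetical.

```python
from itertools import product

def entry_statistics(ell, i=0, j=0):
    """Enumerate SL(2, F_ell) and return (group order, number of matrices
    whose (i, j) entry is a non-square, quadratic character sum over the
    matrices whose (i, j) entry is non-zero)."""
    nonzero_squares = {(x * x) % ell for x in range(1, ell)}
    total = nonsquare = char_sum = 0
    for a, b, c, d in product(range(ell), repeat=4):
        if (a * d - b * c) % ell != 1:
            continue                      # not of determinant 1
        total += 1
        entry = ((a, b), (c, d))[i][j]
        if entry == 0:
            continue                      # 0 is a square, character is 0
        if entry in nonzero_squares:
            char_sum += 1
        else:
            char_sum -= 1
            nonsquare += 1
    return total, nonsquare, char_sum

total, nonsquare, char_sum = entry_statistics(5)
print(total, nonsquare, char_sum, nonsquare / total)
```

For $\ell = 5$ this gives 50 matrices out of 120, a proportion $\ell/(2(\ell+1)) = 5/12$, and the character sum vanishes exactly in this case.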

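(3) and (4): those are similar to (1). Namely, define first a $q$-symplectic polynomial $f$ in
$\mathbf{F}_\ell[T]$ to be one of degree $2g$ such that

\[ f(0) = 1, \quad\text{and}\quad q^{g} T^{2g} f(1/(qT)) = f(T). \]

We can express such a $q$-symplectic polynomial uniquely in the form

\[ f(T) = 1 + a_1(f)T + \cdots + a_{g-1}(f)T^{g-1} + a_g(f)T^{g} + q\,a_{g-1}(f)T^{g+1} + \cdots + q^{g-1} a_1(f)T^{2g-1} + q^{g}T^{2g}, \]

with $a_i(f) \in \mathbf{F}_\ell$, and this expression gives a bijection

\[ f \mapsto (a_1(f),\ldots,a_g(f)) \]

between the set of $q$-symplectic polynomials and $\mathbf{F}_\ell^{g}$.
Then we need to bound

\[ \frac{1}{|Sp(2g,\mathbf{F}_\ell)|} \sum_{f \in \mathcal{F}^{(j)}} |\{g \in G \mid \det(1 - Tg) = f\}| \tag{B.1} \]

where we have put (in case (3) and (4) respectively)

\[ \mathcal{F}^{(3)} = \{f \in \mathbf{F}_\ell[T] \mid f \text{ is } q\text{-symplectic and } f(1) \text{ is a square in } \mathbf{F}_\ell\}, \]

\[ \mathcal{F}^{(4)} = \{f \in \mathbf{F}_\ell[T] \mid f \text{ is } q\text{-symplectic and } q + 1 - a_1(f) \text{ is a square in } \mathbf{F}_\ell\}. \]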
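Now it is easy to check that we have

\[ |\mathcal{F}^{(j)}| = \frac{\ell^{g} + \ell^{g-1}}{2} \ge \frac{\ell^{g}}{2} \tag{B.2} \]

for $j = 3$ or $4$ (recall $\ell$ is odd). Indeed, treating the case $j = 3$ (the other is similar), we have

\[ |\mathcal{F}^{(3)}| = |\{f \mid f(1) = 0\}| + \frac{1}{2} \sum_{f(1) \neq 0} \Bigl(1 + \Bigl(\frac{f(1)}{\ell}\Bigr)\Bigr). \]

The first term is $\ell^{g-1}$ since $f \mapsto f(1)$ is a non-constant affine function of $(a_1,\ldots,a_g) \in \mathbf{F}_\ell^{g}$. The first part of
the second sum is $(\ell^{g} - \ell^{g-1})/2$, and the last is

\[ \frac{1}{2} \sum_{(a_1,\ldots,a_{g-1})} \ \sum_{a_g \neq -\tilde{f}(1)} \Bigl(\frac{a_g + \tilde{f}(1)}{\ell}\Bigr), \]

where $\tilde{f}(1)$ is defined by $f(1) = a_g + \tilde{f}(1)$ (note that $\tilde{f}(1)$ depends only on $(a_1,\ldots,a_{g-1})$). Because
of the summation over the free variable $a_g$, this expression vanishes.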
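The cardinality in (B.2), and the count $\ell^{g-1}$ of $q$-symplectic polynomials with $f(1) = 0$, can be confirmed by enumerating the parameters $(a_1,\ldots,a_g)$ for small values. The sketch below evaluates $f(1)$ directly from the explicit form of a $q$-symplectic polynomial; the function names and the choice $\ell = 7$, $g = 2$ are hypothetical.

```python
from itertools import product

def value_at_one(a, q, ell):
    """f(1) mod ell for the q-symplectic polynomial with parameters
    a = (a_1, ..., a_g), using a_0 = 1 and the coefficient q^j a_{g-j}
    of T^(g+j) for j = 1, ..., g."""
    g = len(a)
    coeffs = (1,) + tuple(a)                  # a_0, a_1, ..., a_g
    low = sum(coeffs)                         # degrees 0 .. g
    high = sum(pow(q, j, ell) * coeffs[g - j] for j in range(1, g + 1))
    return (low + high) % ell

def counts(ell, g, q):
    """Return (#{f : f(1) is a square in F_ell}, #{f : f(1) = 0})."""
    squares = {(x * x) % ell for x in range(ell)}   # 0 counts as a square
    square_values = zero_values = 0
    for a in product(range(ell), repeat=g):
        v = value_at_one(a, q, ell)
        square_values += v in squares
        zero_values += v == 0
    return square_values, zero_values

ell, g = 7, 2
for q in range(1, ell):
    print(q, counts(ell, g, q), ((ell**g + ell**(g - 1)) // 2, ell**(g - 1)))
```

The counts do not depend on $q$, as expected, since $a_g$ enters $f(1)$ with coefficient 1.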

Now appealing to Lemma 7.2 of [Ko1] (itself derived from the work of Chavdarov), we obtain

(B.3) ,a J M l{g£g I det(l-rg) = /}| ^ 



|5p(25,F,)|'^" ' ^ ^^'-{e + l)g 

for all $q$-symplectic polynomials $f$, and hence the stated bound follows by (B.1), (B.2), (B.3).
(5) and (6): this is again similar to (3) and (4), where we now deal with

\[ \frac{1}{|Sp(2g,\mathbf{F}_\ell)|} \sum_{f \in \mathcal{F}^{(j)}} |\{g \in G \mid \det(1 - Tg) = f\}| \]

with now

\[ \mathcal{F}^{(5)} = \{f \in \mathbf{F}_\ell[T] \mid f \text{ is } q\text{-symplectic and } f(1) = 0\}, \]

\[ \mathcal{F}^{(6)} = \{f \in \mathbf{F}_\ell[T] \mid f \text{ is } q\text{-symplectic and } q + 1 = a_1(f)\}. \]

Unfortunately, the definition of $q$-symplectic polynomials is not stated correctly in [Ko1], although none of the
results there are affected by this slip...

We have in both cases $|\mathcal{F}^{(j)}| = \ell^{g-1}$ since the condition is a linear one on the coefficients. By
the proof of Lemma 7.2 of [Ko1] we also have

\{geG\ det{l -Tg) = f}\^j^^ 

for all $f$, and therefore

Since the quantity to estimate is also at most 1 for trivial reasons, we have the desired result. □ 